POLYNUCLEOTIDE ENCODING A POLYPEPTIDE HAVING 
HEPARANASE ACTIVITY AND EXPRESSION OF SAME IN 
TRANSDUCED CELLS 

FIELD AND BACKGROUND OF THE INVENTION 

The present invention relates to a polynucleotide, referred to hereinbelow 
as hpa, encoding a polypeptide having heparanase activity, vectors including 
same and transduced cells expressing heparanase. The invention further relates 
to a recombinant protein having heparanase activity. 

Heparan sulfate proteoglycans: Heparan sulfate proteoglycans (HSPG) 
are ubiquitous macromolecules associated with the cell surface and extracellular 
matrix (ECM) of a wide range of cells of vertebrate and invertebrate tissues (1-4). 
The basic HSPG structure includes a protein core to which several linear heparan 
sulfate chains are covalently attached. These polysaccharide chains are typically 
composed of repeating hexuronic and D-glucosamine disaccharide units that are 
substituted to a varying extent with N- and O-linked sulfate moieties and 
N-linked acetyl groups (1-4). Studies on the involvement of ECM molecules in cell 
attachment, growth and differentiation revealed a central role of HSPG in 
embryonic morphogenesis, angiogenesis, neurite outgrowth and tissue repair 
(1-5). HSPG are prominent components of blood vessels (3). In large blood vessels 
they are concentrated mostly in the intima and inner media, whereas in capillaries 
they are found mainly in the subendothelial basement membrane where they 
support proliferating and migrating endothelial cells and stabilize the structure of 
the capillary wall. The ability of HSPG to interact with ECM macromolecules 
such as collagen, laminin and fibronectin, and with different attachment sites on 
plasma membranes suggests a key role for this proteoglycan in the self-assembly 
and insolubility of ECM components, as well as in cell adhesion and locomotion. 
Cleavage of the heparan sulfate (HS) chains may therefore result in degradation 
of the subendothelial ECM and hence may play a decisive role in extravasation of 
blood-borne cells. HS catabolism is observed in inflammation, wound repair, 
diabetes, and cancer metastasis, suggesting that enzymes which degrade HS play 
important roles in pathologic processes. Heparanase activity has been described 
in activated immune system cells and highly metastatic cancer cells (6-8), but 
research has been handicapped by the lack of biologic tools to explore potential 
causative roles of heparanase in disease conditions. 

Involvement of Heparanase in Tumor Cell Invasion and Metastasis: 
Circulating tumor cells arrested in the capillary beds of different organs must 
invade the endothelial cell lining and degrade its underlying basement membrane 
(BM) in order to invade into the extravascular tissue(s) where they establish 
metastasis (9, 10). Metastatic tumor cells often attach at or near the intercellular 
junctions between adjacent endothelial cells. Such attachment of the metastatic 
cells is followed by rupture of the junctions, retraction of the endothelial cell 
borders and migration through the breach in the endothelium toward the exposed 
underlying BM (9). Once located between endothelial cells and the BM, the 
invading cells must degrade the subendothelial glycoproteins and proteoglycans 
of the BM in order to migrate out of the vascular compartment. Several cellular 
enzymes (e.g., collagenase IV, plasminogen activator, cathepsin B, elastase, etc.) 
are thought to be involved in degradation of BM (10). Among these enzymes is 
an endo-β-D-glucuronidase (heparanase) that cleaves HS at specific intrachain 
sites (6, 8, 11). Expression of a HS degrading heparanase was found to correlate 
with the metastatic potential of mouse lymphoma (11), fibrosarcoma and 
melanoma (8) cells. Moreover, elevated levels of heparanase were detected in 
sera from metastatic tumor bearing animals and melanoma patients (8) and in 
tumor biopsies of cancer patients (12). 

The control of cell proliferation and tumor progression by the local 
microenvironment, focusing on the interaction of cells with the extracellular 
matrix (ECM) produced by cultured corneal and vascular endothelial cells, was 
investigated previously by the present inventors. This cultured ECM closely 
resembles the subendothelium in vivo in its morphological appearance and 
molecular composition. It contains collagens (mostly type III and IV, with 
smaller amounts of types I and V), proteoglycans (mostly heparan sulfate- and 
dermatan sulfate- proteoglycans, with smaller amounts of chondroitin sulfate 
proteoglycans), laminin, fibronectin, entactin and elastin (13, 14). The ability of 
cells to degrade HS in the cultured ECM was studied by allowing cells to interact 
with a metabolically sulfate labeled ECM, followed by gel filtration (Sepharose 
6B) analysis of degradation products released into the culture medium (11). 
While intact HSPG are eluted next to the void volume of the column (Kav<0.2, 
Mr ~0.5x10^6), labeled degradation fragments of HS side chains are eluted more 
toward the Vt of the column (0.5<Kav<0.8, Mr =5-7x10^3) (11). 
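
The elution positions quoted above can be placed on a common scale with the partition coefficient Kav. The following is a minimal sketch, assuming the conventional gel filtration definition Kav = (Ve - Vo)/(Vt - Vo), where Ve is the elution volume of a peak, Vo the void volume and Vt the total column volume. The function names and the example volumes are hypothetical and are not taken from the experiments described here.

```python
def kav(ve, vo, vt):
    """Partition coefficient of a peak, assuming Kav = (Ve - Vo) / (Vt - Vo).

    ve: elution volume of the peak
    vo: void volume of the column
    vt: total volume of the column
    All three must be in the same units (e.g., ml or fraction number).
    """
    if vt <= vo:
        raise ValueError("total volume must exceed void volume")
    return (ve - vo) / (vt - vo)


def classify_peak(k):
    """Assign a peak to the ranges quoted in the text (hypothetical helper)."""
    if k < 0.2:
        return "intact HSPG (near void volume)"
    if 0.5 < k < 0.8:
        return "HS degradation fragments"
    return "unassigned"


if __name__ == "__main__":
    # Illustrative volumes only; not measured data.
    for ve in (50.0, 105.0):
        k = kav(ve, vo=40.0, vt=140.0)
        print(f"Ve={ve}: Kav={k:.2f} -> {classify_peak(k)}")
```

Expressing peaks as Kav rather than as raw fraction numbers makes runs on columns of different size comparable, which is why the ranges in the text are stated in those terms.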

The heparanase inhibitory effect of various non-anticoagulant species of 
heparin that might be of potential use in preventing extravasation of blood-borne 
cells was also investigated by the present inventors. Inhibition of heparanase was 
best achieved by heparin species containing 16 sugar units or more and having 
sulfate groups at both the N and O positions. While O-desulfation abolished the 
heparanase inhibiting effect of heparin, O-sulfated, N-acetylated heparin retained 
a high inhibitory activity, provided that the N-substituted molecules had a 
molecular size of about 4,000 daltons or more (7). Treatment of experimental 
animals with heparanase inhibitors (e.g., non-anticoagulant species of heparin) 
markedly reduced (>90%) the incidence of lung metastases induced by B16 
melanoma, Lewis lung carcinoma and mammary adenocarcinoma cells (7, 8, 16). 
Heparin fractions with high and low affinity to anti-thrombin III exhibited a 
comparable high anti-metastatic activity, indicating that the heparanase inhibiting 
activity of heparin, rather than its anticoagulant activity, plays a role in the 
anti-metastatic properties of the polysaccharide (7). 

Heparanase activity in the urine of cancer patients: In an attempt to 
further elucidate the involvement of heparanase in tumor progression and its 
relevance to human cancer, urine samples were screened for heparanase activity 
(16a). Heparanase activity was detected in the urine of some, but not all, cancer 
patients. High levels of heparanase activity were determined in the urine of 
patients with an aggressive metastatic disease and there was no detectable activity 
in the urine of healthy donors. 

Heparanase activity was also found in the urine of 20% of normal and 
microalbuminuric insulin dependent diabetes mellitus (IDDM) patients, most 
likely due to diabetic nephropathy, the most important single disorder leading to 
renal failure in adults. 

Possible involvement of heparanase in tumor angiogenesis: Fibroblast 
growth factors are a family of structurally related polypeptides characterized by 
high affinity to heparin (17). They are highly mitogenic for vascular endothelial 
cells and are among the most potent inducers of neovascularization (17, 18). 
Basic fibroblast growth factor (bFGF) has been extracted from the subendothelial 
ECM produced in vitro (19) and from basement membranes of the cornea (20), 
suggesting that ECM may serve as a reservoir for bFGF. Immunohistochemical 
staining revealed the localization of bFGF in basement membranes of diverse 
tissues and blood vessels (21). Despite the ubiquitous presence of bFGF in 
normal tissues, endothelial cell proliferation in these tissues is usually very low, 
suggesting that bFGF is somehow sequestered from its site of action. Studies on 
the interaction of bFGF with ECM revealed that bFGF binds to HSPG in the 
ECM and can be released in an active form by HS degrading enzymes (15, 20, 
22). It was demonstrated that heparanase activity expressed by platelets, mast 
cells, neutrophils, and lymphoma cells is involved in release of active bFGF from 
ECM and basement membranes (23), suggesting that heparanase activity may 
not only function in cell migration and invasion, but may also elicit an indirect 
neovascular response. These results suggest that the ECM HSPG provides a 
natural storage depot for bFGF and possibly other heparin-binding growth 
promoting factors (24, 25). Displacement of bFGF from its storage within 
basement membranes and ECM may therefore provide a novel mechanism for 
induction of neovascularization in normal and pathological situations. 

Recent studies indicate that heparin and HS are involved in binding of 
bFGF to high affinity cell surface receptors and in bFGF cell signaling (26, 27). 
Moreover, the size of HS required for optimal effect was similar to that of HS 
fragments released by heparanase (28). Similar results were obtained with 
vascular endothelial cell growth factor (VEGF) (29), suggesting the operation of 
a dual receptor mechanism involving HS in cell interaction with heparin-binding 
growth factors. It is therefore proposed that restriction of endothelial cell growth 
factors in ECM prevents their systemic action on the vascular endothelium, thus 
maintaining a very low rate of endothelial cell turnover and vessel growth. On 
the other hand, release of bFGF from storage in ECM as a complex with an HS 
fragment may elicit localized endothelial cell proliferation and 
neovascularization in processes such as wound healing, inflammation and tumor 
development (24, 25). 

Expression of heparanase by cells of the immune system: Heparanase 
activity correlates with the ability of activated cells of the immune system to 
leave the circulation and elicit both inflammatory and autoimmune responses. 
Interaction of platelets, granulocytes, T and B lymphocytes, macrophages and 
mast cells with the subendothelial ECM is associated with degradation of HS by a 
specific heparanase activity (6). The enzyme is released from intracellular 
compartments (e.g., lysosomes, specific granules, etc.) in response to various 
activation signals (e.g., thrombin, calcium ionophore, immune complexes, 
antigens, mitogens, etc.), suggesting its regulated involvement in inflammation 
and cellular immunity. 

Some of the observations regarding the heparanase enzyme were 
reviewed in reference No. 6 and are listed hereinbelow: 

First, a proteolytic activity (plasminogen activator) and heparanase 
participate synergistically in sequential degradation of the ECM HSPG by 
inflammatory leukocytes and malignant cells. 

Second, a large proportion of the platelet heparanase exists in a latent 
form, probably as a complex with chondroitin sulfate. The latent enzyme is 
activated by tumor cell-derived factor(s) and may then facilitate cell invasion 
through the vascular endothelium in the process of tumor metastasis. 

Third, release of the platelet heparanase from α-granules is induced by a 
strong stimulant (i.e., thrombin), but not in response to platelet activation on 
ECM. 

Fourth, the neutrophil heparanase is preferentially and readily released in 
response to a threshold activation and upon incubation of the cells on ECM. 

Fifth, contact of neutrophils with ECM inhibited release of noxious 
enzymes (proteases, lysozyme) and oxygen radicals, but not of enzymes 
(heparanase, gelatinase) which may enable diapedesis. This protective role of the 
subendothelial ECM was observed when the cells were stimulated with soluble 
factors but not with phagocytosable stimulants. 

Sixth, intracellular heparanase is secreted within minutes after exposure of 
T cell lines to specific antigens. 

Seventh, mitogens (Con A, LPS) induce synthesis and secretion of 
heparanase by normal T and B lymphocytes maintained in vitro. T lymphocyte 
heparanase is also induced by immunization with antigen in vivo. 

Eighth, heparanase activity is expressed by pre-B lymphomas and 
B-lymphomas, but not by plasmacytomas and resting normal B lymphocytes. 

Ninth, heparanase activity is expressed by activated macrophages during 
incubation with ECM, but there was little or no release of the enzyme into the 
incubation medium. Similar results were obtained with human myeloid leukemia 
cells induced to differentiate to mature macrophages. 

Tenth, T-cell mediated delayed type hypersensitivity and experimental 
autoimmunity are suppressed by low doses of heparanase inhibiting 
non-anticoagulant species of heparin (30). 

Eleventh, heparanase activity expressed by platelets, neutrophils and 
metastatic tumor cells releases active bFGF from ECM and basement 
membranes. Release of bFGF from storage in ECM may elicit a localized 
neovascular response in processes such as wound healing, inflammation and 
tumor development. 

Twelfth, among the breakdown products of the ECM generated by 
heparanase is a tri-sulfated disaccharide that can inhibit T-cell mediated 
inflammation in vivo (31). This inhibition was associated with an inhibitory 
effect of the disaccharide on the production of biologically active TNFα by 
activated T cells in vitro (31). 

Other potential therapeutic applications: Apart from its involvement in 
tumor cell metastasis, inflammation and autoimmunity, mammalian heparanase 
may be applied to modulate: bioavailability of heparin-binding growth factors 
(15); cellular responses to heparin-binding growth factors (e.g., bFGF, VEGF) 
and cytokines (IL-8) (31a, 29); cell interaction with plasma lipoproteins (32); 
cellular susceptibility to certain viral and some bacterial and protozoal infections 
(33, 33a, 33b); and disintegration of amyloid plaques (34). Heparanase may thus 
prove useful for conditions such as wound healing, angiogenesis, restenosis, 
atherosclerosis, inflammation, neurodegenerative diseases and viral infections. 
Mammalian heparanase can be used to neutralize plasma heparin, as a potential 
replacement of protamine. Anti-heparanase antibodies may be applied for 
immunodetection and diagnosis of micrometastases, autoimmune lesions and 
renal failure in biopsy specimens, plasma samples, and body fluids. Common use 
in basic research is expected. 

The identification of the hpa gene encoding the heparanase enzyme will 
enable the production of a recombinant enzyme in heterologous expression 
systems. Availability of the recombinant protein will pave the way for solving 
the protein structure-function relationship and will provide a tool for developing 
new inhibitors. 

Viral Infection: The presence of heparan sulfate on cell surfaces has 
been shown to be the principal requirement for the binding of Herpes Simplex 
(33) and Dengue (33a) viruses to cells and for subsequent infection of the cells. 
Removal of the cell surface heparan sulfate by heparanase may therefore abolish 
virus infection. In fact, treatment of cells with bacterial heparitinase (degrading 
heparan sulfate) or heparinase (degrading heparin) reduced the binding of two 
related animal herpes viruses to cells and rendered the cells at least partially 
resistant to virus infection (33). There are some indications that the cell surface 
heparan sulfate is also involved in HIV infection (33b). 

Neurodegenerative diseases: Heparan sulfate proteoglycans were 
identified in the prion protein amyloid plaques of Gerstmann-Straussler 
Syndrome, Creutzfeldt-Jakob disease and Scrapie (34). Heparanase may 
disintegrate these amyloid plaques which are also thought to play a role in the 
pathogenesis of Alzheimer's disease. 

Restenosis and Atherosclerosis: Proliferation of arterial smooth muscle 
cells (SMCs) in response to endothelial injury and accumulation of cholesterol 
rich lipoproteins are basic events in the pathogenesis of atherosclerosis and 
restenosis (35). Apart from its involvement in SMC proliferation (i.e., low 
affinity receptors for heparin-binding growth factors), HS is also involved in 
lipoprotein binding, retention and uptake (36). It was demonstrated that HSPG 
and lipoprotein lipase participate in a novel catabolic pathway that may allow 
substantial cellular and interstitial accumulation of cholesterol rich lipoproteins 
(32). The latter pathway is expected to be highly atherogenic by promoting 
accumulation of apoB and apoE rich lipoproteins (i.e., LDL, VLDL, 
chylomicrons), independent of feedback inhibition by the cellular sterol content. 
Removal of SMC HS by heparanase is therefore expected to inhibit both SMC 
proliferation and lipid accumulation and thus may halt the progression of 
restenosis and atherosclerosis. 

There is thus a widely recognized need for, and it would be highly 
advantageous to have, a polynucleotide encoding a polypeptide having heparanase 
activity, vectors including same, transduced cells expressing heparanase and a 
recombinant protein having heparanase activity. 

SUMMARY OF THE INVENTION 

According to the present invention there is provided a polynucleotide, 
referred to hereinbelow as hpa, hpa cDNA or hpa gene, encoding a polypeptide 
having heparanase activity, vectors including same, transduced cells expressing 
heparanase and a recombinant protein having heparanase activity. 

Cloning of the human hpa gene which encodes heparanase, and expression 
of recombinant heparanase by transfected host cells are reported. 

A purified preparation of heparanase isolated from human hepatoma cells 
was subjected to tryptic digestion and microsequencing. The YGPDVGQPR 
(SEQ ID NO: 8) sequence revealed was used to screen EST databases for 
homology to the corresponding back translated DNA sequence. Two closely 
related EST sequences were identified and were thereafter found to be identical. 
Both clones contained an insert of 1020 bp which included an open reading frame 
of 973 bp followed by 27 bp of 3' untranslated region and a Poly A tail. 
A translation start site was not identified. 
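
Back translation of a short peptide into a degenerate DNA query can be sketched as follows. This is an illustration only, assuming the standard genetic code and IUPAC ambiguity codes; the table and function names are hypothetical, the database search itself is not shown, and the codes used for Leu, Arg and Ser are deliberate over-approximations because their six codons cannot be written as one exact three-letter IUPAC pattern.

```python
# Degenerate codon for each amino acid (standard genetic code, IUPAC codes).
# N = any base, Y = C/T, R = A/G, H = A/C/T, M = A/C, W = A/T, S = C/G.
# L, R and S are over-approximations (they also match a few non-cognate codons).
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}


def back_translate(peptide):
    """Return a degenerate DNA string that covers every coding of `peptide`."""
    try:
        return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]}") from None


if __name__ == "__main__":
    # Tryptic peptide quoted in the text (SEQ ID NO: 8).
    print(back_translate("YGPDVGQPR"))
```

For the nine-residue peptide the result is a 27-nucleotide degenerate pattern; in practice a translated search of the nucleotide database with the peptide itself achieves the same goal without enumerating codons.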

Cloning of the missing 5' end of hpa was performed by PCR amplification 
of DNA from placenta Marathon RACE cDNA composite using primers selected 
according to the EST clones sequence and the linkers of the composite. A 900 bp 
PCR fragment, partially overlapping with the identified 3' encoding EST clones 
was obtained. The joined cDNA fragment (hpa), 1721 bp long (SEQ ID NO:9), 
contained an open reading frame which encodes a polypeptide of 543 amino 
acids (SEQ ID NO: 10) with a calculated molecular weight of 61,192 daltons. 

Cloning an extended 5' sequence was enabled from the human SK-hep1 
cell line by PCR amplification using the Marathon RACE. The 5' extended 
sequence of the SK-hep1 hpa cDNA was assembled with the sequence of the hpa 
cDNA isolated from human placenta (SEQ ID NO:9). The assembled sequence 
contained an open reading frame, SEQ ID NOs: 13 and 15, which encodes, as 
shown in SEQ ID NOs: 14 and 15, a polypeptide of 592 amino acids with a 
calculated molecular weight of 66,407 daltons. 

The ability of the hpa gene product to catalyze degradation of heparan 
sulfate in an in vitro assay was examined by expressing the entire open reading 
frame of hpa in insect cells, using the Baculovirus expression system. Extracts 
and conditioned media of cells infected with virus containing the hpa gene 
demonstrated a high level of heparan sulfate degradation activity both towards 
soluble ECM-derived HSPG and intact ECM. This degradation activity was 
inhibited by heparin, which is another substrate of heparanase. Cells infected 
with a similar construct containing no hpa gene had no such activity, nor did 
non-infected cells. The activity towards heparin of heparanase expressed from 
the extended 5' clone was demonstrated in a mammalian expression system. 

The expression pattern of hpa RNA in various tissues and cell lines was 
investigated using RT-PCR. It was found to be expressed only in tissues and 
cells previously known to have heparanase activity. 

A panel of monochromosomal human/CHO and human/mouse somatic 
cell hybrids was used to localize the human heparanase gene to human 
chromosome 4. The newly isolated heparanase sequence can be used to identify 
a chromosome region harboring a human heparanase gene in a chromosome 
spread. 

According to further features in preferred embodiments of the invention 
described below, there is provided a polynucleotide fragment which includes a 
polynucleotide sequence encoding a polypeptide having heparanase catalytic 
activity. 

According to still further features in the described preferred embodiments 
the polynucleotide fragment includes nucleotides 63-1691 of SEQ ID NO:9 or 
nucleotides 139-1869 of SEQ ID NO: 13, which encode the entire human 
heparanase enzyme. 
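
The nucleotide ranges quoted above can be converted to codon counts by simple arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the text does not state the convention); the function name is hypothetical.

```python
def codon_count(start, end):
    """Number of complete codons in an inclusive, 1-based nucleotide range.

    Raises ValueError if the range is empty or not a multiple of three,
    since such a range cannot be a whole number of codons.
    """
    length = end - start + 1
    if length <= 0:
        raise ValueError("end must not precede start")
    if length % 3 != 0:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    # Ranges quoted in the text for SEQ ID NO:9 and SEQ ID NO:13.
    for name, (start, end) in {"SEQ ID NO:9": (63, 1691),
                               "SEQ ID NO:13": (139, 1869)}.items():
        print(name, end - start + 1, "nt ->", codon_count(start, end), "codons")
```

Under this assumption, nucleotides 63-1691 span 1629 nt (543 codons) and nucleotides 139-1869 span 1731 nt (577 codons). Whether a stop codon is counted inside either range is not stated in the text.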

According to still further features in the described preferred embodiments 
there is provided a polynucleotide fragment which includes a polynucleotide 
sequence capable of hybridizing with hpa cDNA, especially with nucleotides 
1-721 of SEQ ID NO:9. 

According to still further features in the described preferred embodiments 
the polynucleotide sequence which encodes the polypeptide having heparanase 
activity shares at least 60 % homology, preferably at least 70 % homology, more 
preferably at least 80 % homology, most preferably at least 90 % homology with 
SEQ ID NOs:9 or 13. 
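
The homology thresholds are stated without a method of calculation. One common proxy is percent identity over a pairwise alignment, sketched below. This is an assumption for illustration, not the definition used in the specification: the two sequences are taken to be already aligned to equal length with "-" as the gap character, and both function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are ignored; columns where only
    one sequence has a gap count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def homology_tier(pct):
    """Highest of the 60/70/80/90 % thresholds met by `pct`, or None."""
    for threshold in (90, 80, 70, 60):
        if pct >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    pct = percent_identity("ACGTACGTAC", "ACGTTCGTAC")  # toy sequences
    print(pct, homology_tier(pct))
```

Other scoring conventions (for example, similarity matrices for polypeptides, or different gap handling) would give different percentages for the same pair of sequences.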

According to still further features in the described preferred embodiments 
the polynucleotide fragment according to the present invention includes a portion 
(fragment) of SEQ ID NOs:9 or 13. For example, such fragments could include 
nucleotides 63-721 of SEQ ID NO:9 and/or a segment of SEQ ID NO:9 which 
encodes a polypeptide having the heparanase catalytic activity. 

According to still further features in the described preferred embodiments 
the polypeptide encoded by the polynucleotide fragment includes an amino acid 
sequence as set forth in SEQ ID NOs:10 or 14 or a functional part thereof. 

According to still further features in the described preferred embodiments 
the polynucleotide sequence encodes a polypeptide having heparanase activity, 
which shares at least 60 % homology, preferably at least 70 % homology, more 
preferably at least 80 % homology, most preferably at least 90 % homology with 
SEQ ID NOs:10 or 14. 

According to still further features in the described preferred embodiments 
the polynucleotide fragment encodes a polypeptide having heparanase activity, 
which may be an allelic, species and/or induced variant of the amino acid 
sequence set forth in SEQ ID NOs:10 or 14. It is understood that any such 
variant may also be considered a homolog. 

According to still further features in the described preferred embodiments 
there is provided a single stranded polynucleotide fragment which includes a 
polynucleotide sequence complementary to at least a portion of a polynucleotide 
strand encoding a polypeptide having heparanase catalytic activity as described 
above. 
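
A single stranded fragment complementary to a portion of a coding strand can be derived by reverse-complementing that portion. The following is a minimal sketch, assuming a DNA alphabet of A, C, G, T and N; the function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
# Translation table mapping each base to its complement (case preserved).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    unexpected = set(seq) - set("ACGTNacgtn")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    coding_portion = "ATGCCGTAA"  # arbitrary example, not from SEQ ID NO:9
    print(reverse_complement(coding_portion))
```

Applying the function twice returns the original sequence, which is a quick consistency check when preparing probe or antisense sequences.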

According to still further features in the described preferred embodiments 
there is provided a vector including a polynucleotide sequence encoding a 
polypeptide having heparanase catalytic activity. 

The vector may be of any suitable type including but not limited to a 
phage, virus, plasmid, phagemid, cosmid, bacmid or even an artificial 
chromosome. The polynucleotide sequence encoding a polypeptide having 
heparanase catalytic activity may include any of the above described 
polynucleotide fragments. 

According to still further features in the described preferred embodiments 
there is provided a host cell which includes an exogenous polynucleotide 
fragment including a polynucleotide sequence encoding a polypeptide having 
heparanase catalytic activity. 

The exogenous polynucleotide fragment may be any of the above 
described fragments. The host cell may be of any type such as a prokaryotic cell, 
a eukaryotic cell, a cell line, or a cell as a portion of a multicellular organism 
(e.g., cells of a transgenic organism). 

According to still further features in the described preferred embodiments 
there is provided a recombinant protein including a polypeptide having 
heparanase catalytic activity. 

According to still further features in the described preferred embodiments 
there is provided a pharmaceutical composition comprising as an active 
ingredient a recombinant protein having heparanase catalytic activity. 

According to still further features in the described preferred embodiments 
there is provided medical equipment comprising a medical device containing, 
as an active ingredient, a recombinant protein having heparanase catalytic 
activity. 

According to still further features in the described preferred embodiments 
there is provided a heparanase overexpression system comprising a cell 
overexpressing heparanase catalytic activity. 

According to still further features in the described preferred embodiments 
there is provided a method of identifying a chromosome region harboring a 
human heparanase gene in a chromosome spread comprising the steps of (a) 
hybridizing the chromosome spread with a tagged polynucleotide probe 
encoding heparanase; (b) washing the chromosome spread, thereby removing 
excess of non-hybridized probe; and (c) searching for signals associated with said 
hybridized tagged polynucleotide probe, wherein detected signals are 
indicative of a chromosome region harboring a human heparanase gene. 

The present invention can be used to develop new drugs to inhibit tumor 
cell metastasis, inflammation and autoimmunity. The identification of the hpa 
gene encoding the heparanase enzyme enables the production of a recombinant 
enzyme in heterologous expression systems. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is herein described, by way of example only, with reference to 
the accompanying drawings, wherein: 

FIG. 1 presents nucleotide sequence and deduced amino acid sequence of 
hpa cDNA. A single nucleotide difference at position 799 (A to T) between the 
EST (Expressed Sequence Tag) and the PCR amplified cDNA (reverse 
transcribed RNA) and the resulting amino acid substitution (Tyr to Phe) are 
indicated above and below the substituted unit, respectively. Cysteine residues 
and the poly adenylation consensus sequence are underlined. The asterisk 
denotes the stop codon TGA. 

FIG. 2 demonstrates degradation of soluble sulfate labeled HSPG substrate 
by lysates of High Five cells infected with pFhpal virus. Lysates of High Five 
cells that were infected with pFhpal virus (•) or control pF2 virus (□) were 
incubated (18 h, 37 °C) with sulfate labeled ECM-derived soluble HSPG (peak 
I). The incubation medium was then subjected to gel filtration on Sepharose 6B. 
Low molecular weight HS degradation fragments (peak II) were produced only 
during incubation with the pFhpal infected cells, but there was no degradation of 
the HSPG substrate (a) by lysates of pF2 infected cells. 

FIGs. 3a-b demonstrate degradation of soluble sulfate labeled HSPG 
substrate by the culture medium of pFhpal and pFhpa4 infected cells. Culture 
media of High Five cells infected with pFhpal (3a) or pFhpa4 (3b) viruses (•), or 
with control viruses (□) were incubated (18 h, 37 °C) with sulfate labeled 
ECM-derived soluble HSPG (peak I). The incubation media were then subjected to 
gel filtration on Sepharose 6B. Low molecular weight HS degradation fragments 
(peak II) were produced only during incubation with the hpa gene containing 
viruses. There was no degradation of the HSPG substrate by the culture medium 
of cells infected with control viruses. 

FIG. 4 presents size fractionation of heparanase activity expressed by 
pFhpal infected cells. Culture medium of pFhpal infected High Five cells was 
applied onto a 50 kDa cut-off membrane. Heparanase activity (conversion of the 
peak I substrate, (^) into peak II HS degradation fragments) was found in the 
high (> 50 kDa) (•), but not low (< 50 kDa) (o) molecular weight compartment. 

FIGs. 5a-b demonstrate the effect of heparin on heparanase activity 
expressed by pFhpal and pFhpa4 infected High Five cells. Culture media of 
pFhpal (5a) and pFhpa4 (5b) infected High Five cells were incubated (18 h, 37 
°C) with sulfate labeled ECM-derived soluble HSPG (peak I, <^) in the absence (•) 
or presence (a) of 10 ng/ml heparin. Production of low molecular weight HS 
degradation fragments was completely abolished in the presence of heparin, a 
potent inhibitor of heparanase activity (6, 7). 

FIGs. 6a-b demonstrate degradation of sulfate labeled intact ECM by virus 
infected High Five and Sf21 cells. High Five (6a) and Sf21 (6b) cells were plated 
on sulfate labeled ECM and infected (48 h, 28 °C) with pFhpa4 (•) or control 
pF1 (a) viruses. Control non-infected Sf21 cells (r) were plated on the labeled 
ECM as well. The pH of the cultured medium was adjusted to 6.0 - 6.2 followed 
by 24 h incubation at 37 °C. Sulfate labeled material released into the incubation 
medium was analyzed by gel filtration on Sepharose 6B. HS degradation 
fragments were produced only by cells infected with the hpa containing virus. 

FIGs. 7a-b demonstrate degradation of sulfate labeled intact ECM by virus 
infected cells. High Five (7a) and Sf21 (7b) cells were plated on sulfate labeled 
ECM and infected (48 h, 28 °C) with pFhpa4 (•) or control pF1 (n) viruses. 
Control non-infected Sf21 cells (r) were plated on labeled ECM as well. The pH 
of the cultured medium was adjusted to 6.0 - 6.2, followed by 48 h incubation at 
28 °C. Sulfate labeled degradation fragments released into the incubation 
medium were analyzed by gel filtration on Sepharose 6B. HS degradation 
fragments were produced only by cells infected with the hpa containing virus. 

FIGs. 8a-b demonstrate degradation of sulfate labeled intact ECM by the 
culture medium of pFhpa4 infected cells. Culture media of High Five (8a) and 
Sf21 (8b) cells that were infected with pFhpa4 (•) or control pF1 (□) viruses were 
incubated (48 h, 37 °C, pH 6.0) with intact sulfate labeled ECM. The ECM was 
also incubated with the culture medium of control non-infected Sf21 cells (r). 
Sulfate labeled material released into the reaction mixture was subjected to gel 
filtration analysis. Heparanase activity was detected only in the culture medium 
of pFhpa4 infected cells. 

FIGs. 9a-b demonstrate the effect of heparin on heparanase activity in the 
culture medium of pFhpa4 infected cells. Sulfate labeled ECM was incubated 
(24 h, 37 °C, pH 6.0) with culture medium of pFhpa4 infected High Five (9a) and 
Sf21 (9b) cells in the absence (•) or presence (V) of 10 μg/ml heparin. Sulfate 
labeled material released into the incubation medium was subjected to gel 
filtration on Sepharose 6B. Heparanase activity (production of peak II HS 
degradation fragments) was completely inhibited in the presence of heparin. 

FIGs. 10a-b demonstrate purification of recombinant heparanase on 
heparin-Sepharose. Culture medium of Sf21 cells infected with pFhpa4 virus 
was subjected to heparin-Sepharose chromatography. Elution of fractions was 
performed with 0.35 - 2 M NaCl gradient (^). Heparanase activity in the eluted 
fractions is demonstrated in Figure 10a (•). Fractions 15-28 were subjected to 
15% SDS-polyacrylamide gel electrophoresis followed by silver nitrate staining. 
A correlation is demonstrated between a major protein band (MW ~ 63,000) in 
fractions 19-24 and heparanase activity. 

FIGs. 11a-b demonstrate purification of recombinant heparanase on a 
Superdex 75 gel filtration column. Active fractions eluted from heparin- 
Sepharose (Figure 10a) were pooled, concentrated and applied onto Superdex 75 
FPLC column. Fractions were collected and aliquots of each fraction were tested 
for heparanase activity (C, Figure 11a) and analyzed by SDS-polyacrylamide gel 
electrophoresis followed by silver nitrate staining (Figure 11b). A correlation is 
seen between the appearance of a major protein band (MW ~ 63,000) in fractions 
4-7 and heparanase activity. 

FIGs. 12a-e demonstrate expression of the hpa gene by RT-PCR with total 
RNA from human embryonal tissues (12a), human extra-embryonal tissues (12b) 
and cell lines from different origins (12c-e). RT-PCR products using hpa specific 
primers (I), primers for GAPDH housekeeping gene (II), and control reactions 
without reverse transcriptase demonstrating absence of genomic DNA or other 
contamination in RNA samples (III). M - DNA molecular weight marker VI 
(Boehringer Mannheim). For 12a: lane 1 - neutrophil cells (adult), lane 2 - 
muscle, lane 3 - thymus, lane 4 - heart, lane 5 - adrenal. For 12b: lane 1 - kidney, 
lane 2 - placenta (8 weeks), lane 3 - placenta (11 weeks), lanes 4-7 - mole 
(complete hydatidiform mole), lane 8 - cytotrophoblast cells (freshly isolated), 
lane 9 - cytotrophoblast cells (1.5 h in vitro), lane 10 - cytotrophoblast cells (6 h 
in vitro), lane 11 - cytotrophoblast cells (18 h in vitro), lane 12 - cytotrophoblast 
cells (48 h in vitro). For 12c: lane 1 - JAR bladder cell line, lane 2 - NCITT 
testicular tumor cell line, lane 3 - SW-480 human hepatoma cell line, lane 4 - 
HTR (cytotrophoblasts transformed by SV40), lane 5 - HPTLP-I hepatocellular 
carcinoma cell line, lane 6 - EJ-28 bladder carcinoma cell line. For 12d: lane 1 - 
SK-hep-1 human hepatoma cell line, lane 2 - DAMI human megakaryocyte cell 
line, lane 3 - DAMI cell line + PMA, lane 4 - CHRP cell line + PMA, lane 5 - 
CHRP cell line. For 12e: lane 1 - ABAE bovine aortic endothelial cells, lane 2 - 
1063 human ovarian cell line, lane 3 - human breast carcinoma MDA435 cell 
line, lane 4 - human breast carcinoma MDA231 cell line. 

FIG. 13 presents a comparison between nucleotide sequences of the 
human hpa and a mouse EST cDNA fragment (SEQ ID NO: 12) which is 80 % 
homologous to the 3' end (starting at nucleotide 1066 of SEQ ID NO:9) of the 
human hpa. The aligned termination codons are underlined. 

FIG. 14 demonstrates the chromosomal localization of the hpa gene. PCR 
products of DNA derived from somatic cell hybrids and of genomic DNA of 
hamster, mouse and human were separated on 0.7 % agarose gel following 
amplification with hpa specific primers. Lane 1 - Lambda DNA digested with 
BstEII, lane 2 - no DNA control, lanes 3 - 29, PCR amplification products. 
Lanes 3-5 - human, mouse and hamster genomic DNA, respectively. Lanes 6- 
29, human monochromosomal somatic cell hybrids representing chromosomes 1 - 
22 and X and Y, respectively. Lane 30 - Lambda DNA digested with BstEII. An 
amplification product of approximately 2.8 Kb is observed only in lanes 5 and 9, 
representing human genomic DNA and DNA derived from cell hybrid carrying 
human chromosome 4, respectively. These results demonstrate that the hpa gene 
is localized in human chromosome 4. 
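
The lane assignment described for FIG. 14 can be expressed as a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, and the lane layout is assumed to be exactly as stated above (lanes 6-29 carrying the hybrids for chromosomes 1-22, X and Y, in that order).

```python
# Assumed layout from the FIG. 14 legend: lanes 6-29 hold monochromosomal
# hybrids for chromosomes 1-22, X and Y, in that order.
FIRST_HYBRID_LANE = 6
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y"]


def lane_to_chromosome(lane: int) -> str:
    """Return the human chromosome carried by the hybrid loaded in `lane`."""
    index = lane - FIRST_HYBRID_LANE
    if not 0 <= index < len(CHROMOSOMES):
        raise ValueError(f"lane {lane} is not a monochromosomal hybrid lane")
    return CHROMOSOMES[index]


def chromosomes_with_product(positive_lanes):
    """Map lanes showing the amplification product to chromosomes.

    Lanes outside the hybrid panel (markers, controls, genomic DNA) are skipped.
    """
    last_hybrid_lane = FIRST_HYBRID_LANE + len(CHROMOSOMES) - 1
    return [
        lane_to_chromosome(lane)
        for lane in positive_lanes
        if FIRST_HYBRID_LANE <= lane <= last_hybrid_lane
    ]


# Lanes 5 and 9 are the positive lanes reported in the legend.
print(chromosomes_with_product([5, 9]))
```

Under this layout, the only positive hybrid lane (lane 9) corresponds to chromosome 4, in agreement with the legend.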

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The present invention is of a polynucleotide, referred to hereinbelow 
interchangeably as hpa, hpa cDNA or hpa gene, encoding a polypeptide having 
heparanase activity, vectors including same, transduced cells expressing 
heparanase and a recombinant protein having heparanase activity. 

Before explaining at least one embodiment of the invention in detail, it is 
to be understood that the invention is not limited in its application to the details of 
construction and the arrangement of the components set forth in the following 
description or illustrated in the drawings. The invention is capable of other 
embodiments or of being practiced or carried out in various ways. Also, it is to 
be understood that the phraseology and terminology employed herein is for the 
purpose of description and should not be regarded as limiting. 

The present invention can be used to develop treatments for various 
diseases, to develop diagnostic assays for these diseases and to provide new tools 
for basic research especially in the fields of medicine and biology. 

Specifically, the present invention can be used to develop new drugs to 
inhibit tumor cell metastasis, inflammation and autoimmunity. The identification 
of the hpa gene encoding for the heparanase enzyme enables the production of a 
recombinant enzyme in heterologous expression systems. 

Furthermore, the present invention can be used to modulate bioavailability 
of heparin-binding growth factors, cellular responses to heparin-binding growth 
factors (e.g., bFGF, VEGF) and cytokines (IL-8), cell interaction with plasma 
lipoproteins, cellular susceptibility to viral, protozoa and some bacterial 
infections, and disintegration of neurodegenerative plaques. Recombinant 
heparanase is thus a potential treatment for wound healing, angiogenesis, 
restenosis, atherosclerosis, inflammation, neurodegenerative diseases (such as, 
for example, Gerstmann-Straussler Syndrome, Creutzfeldt-Jakob disease, Scrapie 
and Alzheimer's disease) and certain viral and some bacterial and protozoa 
infections. Recombinant heparanase can be used to neutralize plasma heparin, as 
a potential replacement of protamine. 

As used herein, the term "modulate" includes substantially inhibiting, 
slowing or reversing the progression of a disease, substantially ameliorating 
clinical symptoms of a disease or condition, or substantially preventing the 
appearance of clinical symptoms of a disease or condition. A "modulator" 
therefore includes an agent which may modulate a disease or condition. 
Modulation of viral, protozoa and bacterial infections includes any effect which 
substantially interrupts, prevents or reduces any viral, bacterial or protozoa 
activity and/or stage of the virus, bacterium or protozoon life cycle, or which 
reduces or prevents infection by the virus, bacterium or protozoon in a subject, 
such as a human or lower animal. 

As used herein, the term "wound" includes any injury to any portion of the 
body of a subject including, but not limited to, acute conditions such as thermal 
burns, chemical burns, radiation burns, burns caused by excess exposure to 
ultraviolet radiation such as sunburn, damage to bodily tissues such as the 
perineum as a result of labor and childbirth, including injuries sustained during 
medical procedures such as episiotomies, trauma-induced injuries including cuts, 
those injuries sustained in automobile and other mechanical accidents, and those 
caused by bullets, knives and other weapons, and post-surgical injuries, as well as 
chronic conditions such as pressure sores, bedsores, conditions related to diabetes 
and poor circulation, and all types of acne, etc. 

Anti-heparanase antibodies, raised against the recombinant enzyme, would 
be useful for immunodetection and diagnosis of micrometastases, autoimmune 
lesions and renal failure in biopsy specimens, plasma samples, and body fluids. 
Such antibodies may also serve as neutralizing agents for heparanase activity. 

Cloning of the human hpa gene encoding heparanase and expressing 
recombinant heparanase by transfected cells is herein reported. This is the first 
mammalian heparanase gene to be cloned. 

A purified preparation of heparanase isolated from human hepatoma cells 
was subjected to tryptic digestion and microsequencing. 

The YGPDVGQPR (SEQ ID NO:8) sequence revealed was used to screen 
EST databases for homology to the corresponding back translated DNA 
sequences. Two closely related EST sequences were identified and were 
thereafter found to be identical. 
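
The back translation of a peptide into DNA for a database screen can be sketched as follows. This is a minimal illustration and not the procedure actually used: the codon table is the standard genetic code written with IUPAC ambiguity symbols, the single degenerate triplets chosen for Leu, Arg and Ser are an over-inclusive simplification, and the function names are hypothetical.

```python
# One IUPAC-degenerate codon per amino acid (standard genetic code).
# Leu (YTN), Arg (MGN) and Ser (WSN) have six codons each, so the single
# triplets used here also match a few codons of other residues.
DEGENERATE_CODON = {
    "A": "GCN", "R": "MGN", "N": "AAY", "D": "GAY", "C": "TGY",
    "Q": "CAR", "E": "GAR", "G": "GGN", "H": "CAY", "I": "ATH",
    "L": "YTN", "K": "AAR", "M": "ATG", "F": "TTY", "P": "CCN",
    "S": "WSN", "T": "ACN", "W": "TGG", "Y": "TAY", "V": "GTN",
}

# Number of concrete bases represented by each IUPAC symbol.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "M": 2, "K": 2, "S": 2, "W": 2,
    "H": 3, "B": 3, "V": 3, "D": 3,
    "N": 4,
}


def back_translate(peptide: str) -> str:
    """Return a degenerate DNA sequence that can encode `peptide`."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())


def degeneracy(dna: str) -> int:
    """Count the concrete sequences represented by a degenerate sequence."""
    total = 1
    for symbol in dna.upper():
        total *= IUPAC_SIZE[symbol]
    return total


query = back_translate("YGPDVGQPR")  # peptide of SEQ ID NO:8
print(query, degeneracy(query))
```

In practice a translated search of the nucleotide database against the peptide achieves the same goal without enumerating the degenerate sequences.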

Both clones contained an insert of 1020 bp which includes an open 
reading frame of 973 bp followed by a 3' untranslated region of 27 bp and a Poly 
A tail, whereas a translation start site was not identified. 

Cloning of the missing 5' end was performed by PCR amplification of 
DNA from placenta Marathon RACE cDNA composite using primers selected 
according to the EST clones sequence and the linkers of the composite. 

A 900 bp PCR fragment, partially overlapping with the identified 3' 
encoding EST clones was obtained. The joined cDNA fragment (hpa), 1721 bp 
long (SEQ ID NO:9), contained an open reading frame which encodes, as shown 
in Figure 1 and SEQ ID NO:11, a polypeptide of 543 amino acids (SEQ ID 
NO:10) with a calculated molecular weight of 61,192 daltons. 
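
A calculated molecular weight of this kind is conventionally obtained by summing average residue masses and adding one water molecule for the free termini. The sketch below illustrates the arithmetic on the tryptic peptide of SEQ ID NO:8, since the full 543 residue sequence is not reproduced in this section. The residue masses are standard average values; the function name is hypothetical, and the text does not state which software or mass table was actually used.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # added once for the N-terminal H and C-terminal OH


def peptide_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified linear peptide, in daltons."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


print(round(peptide_mass("YGPDVGQPR"), 2))

# Sanity check on the figures quoted in the text: mean mass per residue.
print(round(61192 / 543, 2))
```

The mean of roughly 113 daltons per residue is in the range usually seen for proteins, which is consistent with the stated length and calculated weight.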

A single nucleotide difference at position 799 (A to T) between the EST 
clones and the PCR amplified cDNA was observed. This difference results in a 
single amino acid substitution (Tyr to Phe) (Figure 1). Furthermore, the 
published EST sequences contained an unidentified nucleotide, which following 
DNA sequencing of both the EST clones was resolved into two nucleotides (G 
and C at positions 1630 and 1631 in SEQ ID NO:9, respectively). 

The ability of the hpa gene product to catalyze degradation of heparan 
sulfate in an in vitro assay was examined by expressing the entire open reading 
frame in insect cells, using the Baculovirus expression system. 

Extracts and conditioned media of cells infected with virus containing the 
hpa gene demonstrated a high level of heparan sulfate degradation activity both 
towards soluble ECM-derived HSPG and intact ECM, which was inhibited by 
heparin, while cells infected with a similar construct containing no hpa gene had 
no such activity, nor did non-infected cells. 

The expression pattern of hpa RNA in various tissues and cell lines was 
investigated using RT-PCR. It was found to be expressed only in tissues and 
cells previously known to have heparanase activity. 

An extended 5' sequence was cloned from the human SK-hep-1 
cell line by PCR amplification using Marathon RACE. The 5' extended 
sequence of the SK-hep-1 hpa cDNA was assembled with the sequence of the hpa 
cDNA isolated from human placenta (SEQ ID NO:9). The assembled sequence 
contained an open reading frame, SEQ ID NOs: 13 and 15, which encodes, as 
shown in SEQ ID NOs: 14 and 15, a polypeptide of 592 amino acids, with a 
calculated molecular weight of 66,407 daltons. This open reading frame was 
shown to direct the expression of catalytically active heparanase in a mammalian 
cell expression system. The expressed heparanase was detectable by anti-heparanase 
antibodies in Western blot analysis. 

A panel of monochromosomal human/CHO and human/mouse somatic 
cell hybrids was used to localize the human heparanase gene to human 
chromosome 4. The newly isolated heparanase sequence can therefore be used to 
identify a chromosome region harboring a human heparanase gene in a 
chromosome spread. 

Thus, according to the present invention there is provided a polynucleotide 
fragment (either DNA or RNA, either single stranded or double stranded) which 
includes a polynucleotide sequence encoding a polypeptide having heparanase 
catalytic activity. 

The term "heparanase catalytic activity" or its equivalent term "heparanase 
activity" both refer to a mammalian endoglycosidase hydrolyzing activity which 
is specific for heparan or heparan sulfate proteoglycan substrates, as opposed to 
the activity of bacterial enzymes (heparinase I, II and III) which degrade heparin 
or heparan sulfate by means of β-elimination (37). 

In a preferred embodiment of the invention the polynucleotide fragment 
includes nucleotides 63-1691 of SEQ ID NO:9, or nucleotides 139-1869 of SEQ 
ID NO: 13, which encode the entire human heparanase enzyme. 
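
The relation between a coding range and the length of the encoded polypeptide can be checked with simple arithmetic. The sketch below, with a hypothetical function name, is applied to the range given for SEQ ID NO:9 only, and assumes that the stated range covers sense codons and that the termination codon lies outside it.

```python
def orf_summary(start: int, end: int, includes_stop: bool = False):
    """Return (length in nucleotides, number of encoded amino acids).

    `start` and `end` are 1-based inclusive coordinates. If the range
    includes the termination codon, one codon is subtracted.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    codons = length // 3
    residues = codons - 1 if includes_stop else codons
    return length, residues


# Nucleotides 63-1691 of SEQ ID NO:9, as given in the text.
print(orf_summary(63, 1691))
```

Under that assumption the range spans 1629 nucleotides, or 543 codons, which agrees with the 543 amino acid polypeptide described for SEQ ID NO:10.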

However, the scope of the present invention is not limited to human 
heparanase since this is the first disclosure of an open reading frame (ORF) 
encoding any mammalian heparanase. Using the hpa cDNA, parts thereof or 
synthetic oligonucleotides designed according to its sequence will enable one 
ordinarily skilled in the art to identify genomic and/or cDNA clones including 
homologous sequences from other mammalian species. 

The present invention is therefore further directed at a polynucleotide 
fragment which includes a polynucleotide sequence capable of hybridizing (base 
pairing under either stringent or permissive hybridization conditions, as for 
example described in Sambrook, J., Fritsch, E.F., Maniatis, T. (1989) Molecular 
Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory Press, New 
York.) with hpa cDNA, especially with nucleotides 1-721 of SEQ ID NO:9. 

In fact, any polynucleotide sequence which encodes a polypeptide having 
heparanase activity and which shares at least 60 % homology, preferably at least 
70 % homology, more preferably at least 80 % homology, most preferably at 
least 90 % homology with SEQ ID NOs:9 or 13 is within the scope of the present 
invention. 
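
The text does not define how percent homology is to be computed. As one possible reading, the sketch below computes simple percent identity over two sequences that have already been aligned to equal length, with "-" marking gaps. The function names and the choice of identity as the measure are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment: 9 of 10 columns identical.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

A real comparison would first align the candidate sequence against SEQ ID NO:9 or 13 with an alignment program; only the final scoring step is shown here.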

The polynucleotide fragment according to the present invention may 
include any part of SEQ ID NOs: 9 or 13. For example, it may include 
nucleotides 63-721 of SEQ ID NO:9, which is a novel sequence. However, it 
may include any segment of SEQ ID NOs:9 or 13 which encodes a polypeptide 
having the heparanase catalytic activity. 

When the phrase "encodes a polypeptide having heparanase catalytic 
activity" is used herein and in the claims section below it refers to the ability of 
directing the synthesis of a polypeptide which, if so required for its activity, 
following post translational modifications, such as but not limited to, proteolysis 
(e.g., removal of a signal peptide and of a pro- or preprotein sequence), 
methionine modification, glycosylation, alkylation (e.g., methylation), 
acetylation, etc., is catalytically active in degradation of, for example, ECM and 
cell surface associated HS. 

In a preferred embodiment of the invention the polypeptide encoded by the 
polynucleotide fragment includes an amino acid sequence as set forth in SEQ ID 
NOs: 10 or 14 or a functional part thereof, i.e., a portion harboring heparanase 
catalytic activity. 

However, any polynucleotide fragment which encodes a polypeptide 
having heparanase activity is within the scope of the present invention. 
Therefore, the polypeptide may be an allelic, species and/or induced variant of the 
amino acid sequence set forth in SEQ ID NOs: 10 or 14 or functional part thereof. 

In fact, any polynucleotide sequence which encodes a polypeptide having 
heparanase activity, which shares at least 60 % homology, preferably at least 70 
% homology, more preferably at least 80 % homology, most preferably at least 
90 % homology with SEQ ID NOs: 10 or 14 is within the scope of the present 
invention. 

The invention is also directed at providing a single stranded 
polynucleotide fragment which includes a polynucleotide sequence 
complementary to at least a portion of a polynucleotide strand encoding a 
polypeptide having heparanase catalytic activity as described above. The term 
"complementary" as used herein refers to the ability of base pairing. 

The single stranded polynucleotide fragment may be DNA or RNA or 
even include nucleotide analogs (e.g., thioated nucleotides), it may be a synthetic 
oligonucleotide or manufactured by transduced host cells, it may be of any 
desired length which still provides specific base pairing (e.g., 8 or 10, preferably 
more, nucleotides long) and it may include mismatches that do not hamper base 
pairing. 
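
Complementarity in the base pairing sense, including a tolerance for mismatches, can be illustrated with a short sketch. The sequences used are hypothetical toy examples, the function names are illustrative, and only Watson-Crick pairs are considered (nucleotide analogs are outside the scope of the sketch).

```python
# Watson-Crick complements; U is treated like T so RNA input is accepted,
# and the result is always written in the DNA alphabet.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs antiparallel with `seq`, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions at which `probe` fails to pair with `target_site`.

    Both sequences are written 5' to 3' and must have equal length.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal length")
    perfect = reverse_complement(target_site).upper()
    probe = probe.upper().replace("U", "T")
    return sum(1 for p, q in zip(probe, perfect) if p != q)


print(reverse_complement("GATTC"))
print(count_mismatches("GAACC", "GATTC"))
```

Whether a given number of mismatches still permits specific pairing depends on probe length and hybridization conditions, which the sketch does not model.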

The invention is further directed at providing a vector which includes a 
polynucleotide sequence encoding a polypeptide having heparanase catalytic 
activity. 

The vector may be of any type. It may be a phage which infects bacteria 
or a virus which infects eukaryotic cells. It may also be a plasmid, phagemid, 
cosmid, bacmid or an artificial chromosome. The polynucleotide sequence 
encoding a polypeptide having heparanase catalytic activity may include any of 
the above described polynucleotide fragments. 

The invention is further directed at providing a host cell which includes an 
exogenous polynucleotide fragment encoding a polypeptide having heparanase 
catalytic activity. 

The exogenous polynucleotide fragment may be any of the above 
described fragments. The host cell may be of any type. It may be a prokaryotic 
cell, a eukaryotic cell, a cell line, or a cell as a portion of an organism. The 
exogenous polynucleotide fragment may be permanently or transiently present in 
the cell. In other words, transduced cells obtained following stable or transient 
transfection, transformation or transduction are all within the scope of the present 
invention. The term "exogenous" as used herein refers to the fact that the 
polynucleotide fragment is externally introduced into the cell. Therein it may be 
present in a single or any number of copies, it may be integrated into one or more 
chromosomes at any location or be present as an extrachromosomal material. 

The invention is further directed at providing a heparanase overexpression 
system which includes a cell overexpressing heparanase catalytic activity. The 
cell may be a host cell transiently or stably transfected or transformed with any 
suitable vector which includes a polynucleotide sequence encoding a polypeptide 
having heparanase activity and a suitable promoter and enhancer sequences to 
direct overexpression of heparanase. However, the overexpressing cell may also 
be a product of an insertion (e.g., via homologous recombination) of a promoter 
and/or enhancer sequence downstream to the endogenous heparanase gene of the 
expressing cell, which will direct overexpression from the endogenous gene. The 
term "overexpression" as used herein in the specification and claims below refers 
to a level of expression which is higher than a basal level of expression typically 
characterizing a given cell under otherwise identical conditions. 

The invention is further directed at providing a recombinant protein 
including a polypeptide having heparanase catalytic activity. 

The recombinant protein may be purified close to homogeneity by any 
conventional protein purification procedure and/or be mixed with additives. The 
recombinant protein may be manufactured using any of the cells described above. 
The recombinant protein may be in any form. It may be in a crystallized form, a 
dehydrated powder form or in solution. The recombinant protein may be useful 
in obtaining pure heparanase, which in turn may be useful in eliciting anti- 
heparanase antibodies, either poly or monoclonal antibodies, and as a screening 
active ingredient in an anti-heparanase inhibitors or drugs screening assay or 
system. 

The invention is further directed at providing a pharmaceutical 
composition which includes as an active ingredient a recombinant protein having 
heparanase catalytic activity. 

Formulations for topical administration may include, but are not limited 
to, lotions, ointments, gels, creams, suppositories, drops, liquids, sprays and 
powders. Conventional pharmaceutical carriers, aqueous, powder or oily bases, 
thickeners and the like may be necessary or desirable. Coated condoms, stents, 
active pads, and other medical devices may also be useful. In fact the scope of 
the present invention includes any medical equipment such as a medical device 
containing, as an active ingredient, a recombinant protein having heparanase 
catalytic activity. 

Compositions for oral administration include powders or granules, 
suspensions or solutions in water or non-aqueous media, sachets, capsules or 
tablets. Thickeners, diluents, flavorings, dispersing aids, emulsifiers or binders 
may be desirable. 

Formulations for parenteral administration may include, but are not 
limited to, sterile aqueous solutions which may also contain buffers, diluents and 
other suitable additives. 

Dosing is dependent on severity and responsiveness of the condition to be 
treated, but will normally be one or more doses per day, with course of treatment 
lasting from several days to several months or until a cure is effected or a 
diminution of disease state is achieved. Persons ordinarily skilled in the art can 
easily determine optimum dosages, dosing methodologies and repetition rates. 

Further according to the present invention there is provided a method of 
identifying a chromosome region harboring a human heparanase gene in a 
chromosome spread. The method is executed by implementing the following 
steps. In a first step, the chromosome spread (either interphase or 
metaphase spread) is hybridized with a tagged polynucleotide probe encoding 
heparanase. The tag is preferably a fluorescent tag. In a second step, 
the chromosome spread is washed, thereby removing excess non-hybridized 
probe. Finally, signals associated with the hybridized tagged 
polynucleotide probe are searched for, wherein detected signals are indicative 
of a chromosome region harboring the human heparanase gene. One ordinarily 
skilled in the art would know how to use the sequences disclosed herein in 
suitable labeling reactions and how to use the tagged probes to detect, using in 
situ hybridization, a chromosome region harboring a human heparanase gene. 

Reference is now made to the following examples, which together with the 
above descriptions, illustrate the invention in a non-limiting fashion. 

EXAMPLES 

The following protocols and experimental details are referenced in the 
Examples that follow: 

Purification and characterization of heparanase from a human 
hepatoma cell line and human placenta: A human hepatoma cell line (SK-hep-1) 
was chosen as a source for purification of a human tumor-derived heparanase. 
Purification was essentially as described in U.S. Pat. No. 5,362,641 to Fuks, 
which is incorporated by reference as if fully set forth herein. Briefly, 500 liter, 
5 x 10^11 cells were grown in suspension and the heparanase enzyme was purified 
about 240,000 fold by applying the following steps: (i) cation exchange (CM- 
Sephadex) chromatography performed at pH 6.0, 0.3-1.4 M NaCl gradient; (ii) 
cation exchange (CM-Sephadex) chromatography performed at pH 7.4 in the 
presence of 0.1% CHAPS, 0.3-1.1 M NaCl gradient; (iii) heparin-Sepharose 
chromatography performed at pH 7.4 in the presence of 0.1% CHAPS, 0.35-1.1 
M NaCl gradient; (iv) ConA-Sepharose chromatography performed at pH 6.0 in 
buffer containing 0.1 % CHAPS and 1 M NaCl, elution with 0.25 M α-methyl 
mannoside; and (v) HPLC cation exchange (Mono-S) chromatography performed 
at pH 7.4 in the presence of 0.1 % CHAPS, 0.25-1 M NaCl gradient. 

Active fractions were pooled, precipitated with TCA and the precipitate 
subjected to SDS polyacrylamide gel electrophoresis and/or tryptic digestion and 
reverse phase HPLC. Tryptic peptides of the purified protein were separated by 
reverse phase HPLC (C8 column) and homogeneous peaks were subjected to 
amino acid sequence analysis. 

The purified enzyme was applied to reverse phase HPLC and subjected to 
N-terminal amino acid sequencing using the amino acid sequencer (Applied 
Biosystems). 

Cells: Cultures of bovine corneal endothelial cells (BCECs) were 
established from steer eyes as previously described (19, 38). Stock cultures were 
maintained in DMEM (1 g glucose/liter) supplemented with 10 % newborn calf 
serum and 5 % FCS. bFGF (1 ng/ml) was added every other day during the 
phase of active cell growth (13, 14). 

Preparation of dishes coated with ECM: BCECs (second to fifth 
passage) were plated into 4-well plates at an initial density of 2 x 10^ cells/ml, 
and cultured in sulfate-free Fisher medium plus 5 % dextran T-40 for 12 days. 
Na2[35S]O4 (25 µCi/ml) was added on days 1 and 5 after seeding and the cultures 
were incubated with the label without medium change. The subendothelial ECM 
was exposed by dissolving (5 min., room temperature) the cell layer with PBS 
containing 0.5 % Triton X-100 and 20 mM NH4OH, followed by four washes 
with PBS. The ECM remained intact, free of cellular debris and firmly attached 
to the entire area of the tissue culture dish (19, 22). 

To prepare soluble sulfate labeled proteoglycans (peak I material), the 
ECM was digested with trypsin (25 µg/ml, 6 h, 37 °C), the digest was 
concentrated by reverse dialysis and the concentrated material was applied onto a 
Sepharose 6B gel filtration column. The resulting high molecular weight 
material (Kav < 0.2, peak I) was collected. More than 80 % of the labeled 
material was shown to be composed of heparan sulfate proteoglycans (11, 39). 

Heparanase activity: Cells (1 x 10^/35-mm dish), cell lysates or 
conditioned media were incubated on top of 35S-labeled ECM (18 h, 37 °C) in the 
presence of 20 mM phosphate buffer (pH 6.2). Cell lysates and conditioned 
media were also incubated with sulfate labeled peak I material (10-20 µl). The 
incubation medium was collected, centrifuged (18,000 x g, 4 °C, 3 min.), and 
sulfate labeled material analyzed by gel filtration on a Sepharose CL-6B column 
(0.9 x 30 cm). Fractions (0.2 ml) were eluted with PBS at a flow rate of 5 ml/h 
and counted for radioactivity using Bio-fluor scintillation fluid. The excluded 
volume (Vo) was marked by blue dextran and the total included volume (Vt) by 
phenol red. The latter was shown to comigrate with free sulfate (7, 11, 23). 
Degradation fragments of HS side chains were eluted from Sepharose 6B at 0.5 < 
Kav < 0.8 (peak II) (7, 11, 23). A nearly intact HSPG released from ECM by 
trypsin - and, to a lower extent, during incubation with PBS alone - was eluted 
next to Vo (Kav < 0.2, peak I). Recoveries of labeled material applied on the 
columns ranged from 85 to 95 % in different experiments (11). Each experiment 
was performed at least three times and the variation of elution positions (Kav 
values) did not exceed +/- 15 %. 
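
The Kav values quoted above follow the conventional gel filtration definition, Kav = (Ve - Vo) / (Vt - Vo), where Ve is the elution volume of the material of interest. The text does not spell this formula out, so it is stated here as an assumption. The sketch below applies it and assigns an elution position to peak I or peak II using the ranges given in the text; the function names and the example volumes are hypothetical.

```python
def kav(ve: float, vo: float, vt: float) -> float:
    """Partition coefficient Kav = (Ve - Vo) / (Vt - Vo).

    ve: elution volume of the sample, vo: excluded (void) volume,
    vt: total included volume; all in the same unit (e.g. ml).
    """
    if vt <= vo:
        raise ValueError("Vt must be larger than Vo")
    return (ve - vo) / (vt - vo)


def classify_peak(k: float) -> str:
    """Assign a Kav value to the peaks defined in the text."""
    if k < 0.2:
        return "peak I"       # nearly intact HSPG, eluted next to Vo
    if 0.5 < k < 0.8:
        return "peak II"      # HS degradation fragments
    return "unassigned"


# Hypothetical column: Vo = 7.0 ml (blue dextran), Vt = 17.0 ml (phenol red).
for ve in (8.0, 13.0):
    k = kav(ve, 7.0, 17.0)
    print(ve, round(k, 2), classify_peak(k))
```

With 0.2 ml fractions, an elution volume can be approximated as the fraction number multiplied by 0.2 ml, which is a further simplifying assumption.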

Cloning of hpa cDNA: cDNA clones 257548 and 260138 were obtained 
from the I.M.A.G.E Consortium (2130 Memorial Parkway SW, Huntsville, AL 
35801). The cDNAs were originally cloned in EcoRI and NotI cloning sites in 
the plasmid vector pT3T7D-Pac. Although these clones are reported to be 
somewhat different, DNA sequencing demonstrated that these clones are 
identical to one another. Marathon RACE (rapid amplification of cDNA ends) 
human placenta (poly-A) cDNA composite was a gift of Prof. Yossi Shiloh of Tel 
Aviv University. This composite is vector free, as it includes reverse transcribed 
cDNA fragments to which double, partially single stranded adapters are attached 
on both sides. The construction of the specific composite employed is described 
in reference 39a. 

Amplification of hp3 PCR fragment was performed according to the 
protocol provided by Clontech laboratories. The template used for amplification 
was a sample taken from the above composite. The primers used for 
amplification were: 

First step: 5'-primer: AP1: 5'-CCATCCTAATACGACTCACTATAGGGC-3', 
SEQ ID NO:1; 3'-primer: HPL229: 5'-GTAGTGATGCCATGTAACTGAATC-3', 
SEQ ID NO:2. 

Second step: nested 5'-primer: AP2: 5'-ACTCACTATAGGGCTCGAGCGGC-3', 
SEQ ID NO:3; nested 3'-primer: HPL171: 5'-GCATCTTAGCCGTCTTTCTTCG-3', 
SEQ ID NO:4. The HPL229 and HPL171 primers were selected according to the 
sequence of the EST clones. They include nucleotides 933-956 and 876-897 of 
SEQ ID NO:9, respectively. 

PCR program was 94 °C - 4 min., followed by 30 cycles of 94 °C - 40 
sec, 62 °C - 1 min., 72 °C - 2.5 min. Amplification was performed with Expand 
High Fidelity (Boehringer Mannheim). The resulting ca. 900 bp hp3 PCR 
product was digested with BfrI and PvuII. Clone 257548 (phpa1) was digested 
with EcoRI, followed by end filling and was then further digested with BfrI. 
Thereafter the PvuII - BfrI fragment of the hp3 PCR product was cloned into the 
blunt end - BfrI end of clone phpa1 which resulted in having the entire cDNA 
cloned in pT3T7-pac vector, designated phpa2. 

DNA Sequencing: Sequence determinations were performed with vector 
specific and gene specific primers, using an automated DNA sequencer (Applied 
Biosystems, model 373A). Each nucleotide was read from at least two 
independent primers. 

Computer analysis of sequences: Database searches for sequence 
similarities were performed using the Blast network service. Sequence analysis 
and alignment of DNA and protein sequences were done using the DNA 
sequence analysis software package developed by the Genetics Computer Group 
(GCG) at the University of Wisconsin. 

RT-PCR: RNA was prepared using TRI-Reagent (Molecular Research 
Center Inc.) according to the manufacturer's instructions. 1.25 µg were taken for 
reverse transcription reaction using MuMLV Reverse transcriptase (Gibco BRL) 
and Oligo (dT)15 primer, SEQ ID NO:5, (Promega). Amplification of the 
resultant first strand cDNA was performed with Taq polymerase (Promega). The 
following primers were used: 

HPU-355: 5'-TTCGATCCCAAGAAGGAATCAAC-3', SEQ ID NO:6, 
nucleotides 372-394 in SEQ ID NO:9 or 11. 

HPL-229: 5'-GTAGTGATGCCATGTAACTGAATC-3', SEQ ID NO:7, 
nucleotides 933-956 in SEQ ID NO:9 or 11. 

PCR program: 94 °C - 4 min., followed by 30 cycles of 94 °C - 40 sec, 62 °C - 1 
min., 72 °C - 1 min. 
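
The primer coordinates given above allow two simple consistency checks: that each primer length matches its stated span, and what product length the primer pair would be expected to yield. The sketch below is illustrative; the class and function names are hypothetical, and the expected length assumes a cDNA template identical to SEQ ID NO:9 between the two primers (the text itself does not state the product size).

```python
from typing import NamedTuple


class Primer(NamedTuple):
    name: str
    sequence: str  # written 5' to 3'
    start: int     # first nucleotide covered in SEQ ID NO:9 (1-based)
    end: int       # last nucleotide covered in SEQ ID NO:9 (inclusive)


HPU_355 = Primer("HPU-355", "TTCGATCCCAAGAAGGAATCAAC", 372, 394)
HPL_229 = Primer("HPL-229", "GTAGTGATGCCATGTAACTGAATC", 933, 956)


def span_matches_length(primer: Primer) -> bool:
    """True if the stated coordinate span equals the primer length."""
    return primer.end - primer.start + 1 == len(primer.sequence)


def expected_amplicon_length(upper: Primer, lower: Primer) -> int:
    """Product length from the first base of the upper primer to the last
    base covered by the lower (antisense) primer."""
    return lower.end - upper.start + 1


for primer in (HPU_355, HPL_229):
    print(primer.name, len(primer.sequence), span_matches_length(primer))
print(expected_amplicon_length(HPU_355, HPL_229))
```

Both primers are consistent with their stated spans, and the pair would be expected to give a product of 585 bp under the stated assumption.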

Expression of recombinant heparanase in insect cells: Cells: High Five 
and Sf21 insect cell lines were maintained as monolayer cultures in SF900II-SFM 
medium (GibcoBRL). 

Recombinant Baculovirus: Recombinant virus containing the hpa gene 
was constructed using the Bac to Bac system (GibcoBRL). The transfer vector 
pFastBac was digested with SalI and NotI and ligated with a 1.7 kb fragment of 
phpa2 digested with XhoI and NotI. The resulting plasmid was designated 
pFasthpa2. An identical plasmid designated pFasthpa4 was prepared as a 
duplicate and both independently served for further experimentations. 
Recombinant bacmid was generated according to the instructions of the 
manufacturer with pFasthpa2, pFasthpa4 and with pFastBac. The latter served as 
a negative control. Recombinant bacmid DNAs were transfected into Sf21 insect 
cells. Five days after transfection recombinant viruses were harvested and used 
to infect High Five insect cells, 3 x 10^6 cells in T-25 flasks. Cells were harvested 
2 - 3 days after infection. 4 x 10^6 cells were centrifuged and resuspended in a 
reaction buffer containing 20 mM phosphate citrate buffer, 50 mM NaCl. Cells 
underwent three cycles of freeze and thaw and lysates were stored at -80 °C. 
Conditioned medium was stored at 4 °C. 

Partial purification of recombinant heparanase: Partial purification of 
recombinant heparanase was performed by heparin-Sepharose column 
chromatography followed by Superdex 75 column gel filtration. Culture medium 
(150 ml) of Sf21 cells infected with pFhpa4 virus was subjected to 
heparin-Sepharose chromatography. Elution of 1 ml fractions was performed with a 
0.35 - 2 M NaCl gradient in the presence of 0.1 % CHAPS and 1 mM DTT in 10 mM 
sodium acetate buffer, pH 5.0. A 25 µl sample of each fraction was tested for 
heparanase activity. Heparanase activity was eluted at the range of 0.65 - 1.1 M 
NaCl (fractions 18-26, Figure 10a). 5 µl of each fraction was subjected to 15 % 
SDS-polyacrylamide gel electrophoresis followed by silver nitrate staining. 
Active fractions eluted from heparin-Sepharose (Figure 10a) were pooled and 
concentrated (x 6) on YM3 cut-off membrane. 0.5 ml of the concentrated 
material was applied onto 30 ml Superdex 75 FPLC column equilibrated with 10 
mM sodium acetate buffer, pH 5.0, containing 0.8 M NaCl, 1 mM DTT and 0.1 
% CHAPS. Fractions (0.56 ml) were collected at a flow rate of 0.75 ml/min. 
Aliquots of each fraction were tested for heparanase activity and were subjected 
to SDS-polyacrylamide gel electrophoresis followed by silver nitrate staining 
(Figure 11b). 

EXAMPLE 1 
Cloning of the hpa gene 
Purified fraction of heparanase isolated from human hepatoma cells 
(SK-hep-1) was subjected to tryptic digestion and microsequencing. EST (Expressed 
Sequence Tag) databases were screened for homology to the back translated 
DNA sequences corresponding to the obtained peptides. Two EST sequences 
(accession Nos. N41349 and N45367) contained a DNA sequence encoding the 
peptide YGPDVGQPR (SEQ ID NO:8). These two sequences were derived from 
clones 257548 and 260138 (I.M.A.G.E Consortium) prepared from 8 to 9 weeks 
placenta cDNA library (Soares). Both clones, which were found to be identical, 
contained an insert of 1020 bp which included an open reading frame (ORF) of 
973 bp followed by a 3' untranslated region of 27 bp and a Poly A tail. No 
translation start site (AUG) was identified at the 5' end of these clones. 
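
The database screen described above amounts to asking whether a nucleotide sequence, translated in some reading frame, contains a sequenced peptide. A minimal Python sketch of that idea follows. It uses the standard genetic code and checks only the three forward frames; the function names are illustrative, and the test fragment is the stretch encoding YGPDVGQPR as read from Figure 1.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated with T, C, A, G at each position.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA string in the given forward frame (0, 1 or 2)."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons with unexpected characters are reported as "X".
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frames_encoding(dna: str, peptide: str) -> list[int]:
    """Return the forward frames whose translation contains the peptide."""
    return [f for f in range(3) if peptide in translate(dna, f)]


fragment = "TATGGTCCTGATGTTGGTCAGCCTCGA"
print(translate(fragment))                             # YGPDVGQPR
print(frames_encoding("GC" + fragment, "YGPDVGQPR"))   # [2]
```

A real screen would also translate the reverse strand and tolerate mismatches; both are omitted here for brevity.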

Cloning of the missing 5' end was performed by PCR amplification of 
DNA from a placenta Marathon RACE cDNA composite. A 900 bp fragment 
(designated hp3), partially overlapping with the identified 3' encoding EST clones 
was obtained. 

The joined cDNA fragment, 1721 bp long (SEQ ID NO:9), contained an 
open reading frame which encodes, as shown in Figure 1 and SEQ ID NO:11, a 
polypeptide of 543 amino acids (SEQ ID NO:10) with a calculated molecular 
weight of 61,192 daltons. The 3' end of the partial cDNA inserts contained in 
clones 257548 and 260138 started at nucleotide G721 of SEQ ID NO:9 and Figure 1. 
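
The figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the open reading frame spans nucleotides 63-1691 of SEQ ID NO:9 (the range recited in the claims) and that this range excludes the stop codon. The helper name and that reading of the coordinates are assumptions.

```python
def codons_in_range(start: int, end: int) -> int:
    """Number of complete codons in a 1-based, inclusive nucleotide range."""
    length = end - start + 1
    if length % 3:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    return length // 3


residues = codons_in_range(63, 1691)
print(residues)                    # 543
print(round(61192 / residues, 1))  # 112.7 (mean residue mass in daltons)
```

A mean residue mass of roughly 110 daltons is typical of proteins, so the calculated molecular weight is in line with a 543 residue polypeptide.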

As further shown in Figure 1, there was a single sequence discrepancy 
between the EST clones and the PCR amplified sequence, which led to an amino 
acid substitution from Tyr246 in the EST to Phe246 in the amplified cDNA. The 
nucleotide sequence of the PCR amplified cDNA fragment was verified from two 
independent amplification products. The new gene was designated hpa. 

As stated above, the 3' end of the partial cDNA inserts contained in EST 
clones 257548 and 260138 started at nucleotide 721 of hpa (SEQ ID NO:9). The 
ability of the hpa cDNA to form stable secondary structures, such as stem and 
loop structures involving nucleotide stretches in the vicinity of position 721, was 
investigated using computer modeling. It was found that stable stem and loop 
structures are likely to be formed involving nucleotides 698-724 (SEQ ID NO:9). 
In addition, a high GC content, up to 70 %, characterizes the 5' end region of the 
hpa gene, as compared to only about 40 % in the 3' region. These findings may 
explain the premature termination and therefore lack of 5' ends in the EST clones. 

To examine the ability of the hpa gene product to catalyze degradation of 
heparan sulfate in an in vitro assay the entire open reading frame was expressed 
in insect cells, using the Baculovirus expression system. Extracts of cells 
infected with virus containing the hpa gene demonstrated a high level of heparan 
sulfate degradation activity, while cells infected with a similar construct 
containing no hpa gene had no such activity, nor did non-infected cells. These 
results are further demonstrated in the following Examples. 
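
The GC content comparison reported in this Example is straightforward to reproduce for any sequence. The following is a minimal sketch, assuming a plain string of A/C/G/T characters; the window size and the toy sequence are arbitrary choices made for illustration and are not taken from SEQ ID NO:9.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """GC fraction of successive windows, keyed by 1-based window start."""
    return [
        (start + 1, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
toy = "GCGCGGCCGC" + "ATTAGATTAC"
for start, gc in windowed_gc(toy, window=10, step=10):
    print(start, f"{gc:.0%}")
```

Applied to a full cDNA with a window of a few hundred nucleotides, the same function would show where along the transcript the GC-rich region lies.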

EXAMPLE 2 
Degradation of soluble ECM-derived HSPG 

Monolayer cultures of High Five cells were infected (72 h, 28 °C) with 
recombinant Baculovirus containing the pFasthpa plasmid or with control virus 
containing an insert free plasmid. The cells were harvested and lysed in 
heparanase reaction buffer by three cycles of freezing and thawing. The cell 
lysates were then incubated (18 h, 37 °C) with sulfate labeled, ECM-derived 
HSPG (peak I), followed by gel filtration analysis (Sepharose 6B) of the reaction 
mixture. 

As shown in Figure 2, the substrate alone included almost entirely high 
molecular weight (Mr) material eluted next to Vo (peak I, fractions 5-20, Kav < 
0.35). A similar elution pattern was obtained when the HSPG substrate was 
incubated with lysates of cells that were infected with control virus. In contrast, 
incubation of the HSPG substrate with lysates of cells infected with the hpa 
containing virus resulted in a complete conversion of the high Mr substrate into 
low Mr labeled degradation fragments (peak II, fractions 22-35, 0.5 < Kav < 
0.75). 
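
For readers unfamiliar with the Kav values quoted above, the partition coefficient is conventionally computed from the elution volume Ve, the void volume Vo and the total bed volume Vt as Kav = (Ve - Vo)/(Vt - Vo). The sketch below applies that standard definition. The column volumes used in the example are hypothetical and are not taken from the experiments described here.

```python
def kav(elution_volume: float, void_volume: float, total_volume: float) -> float:
    """Gel filtration partition coefficient Kav = (Ve - Vo) / (Vt - Vo)."""
    if total_volume <= void_volume:
        raise ValueError("total bed volume must exceed the void volume")
    return (elution_volume - void_volume) / (total_volume - void_volume)


def classify(value: float) -> str:
    """Assign a Kav value to the peak ranges quoted in the text."""
    if value < 0.35:
        return "peak I"
    if 0.5 < value < 0.75:
        return "peak II"
    return "unassigned"


# Hypothetical column: Vo = 10 ml, Vt = 50 ml.
for ve in (14.0, 36.0):
    value = kav(ve, void_volume=10.0, total_volume=50.0)
    print(f"Ve = {ve} ml -> Kav = {value:.2f} ({classify(value)})")
```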

Fragments eluted in peak II were shown to be degradation products of 
heparan sulfate, as they were (i) 5- to 6-fold smaller than intact heparan sulfate 
side chains (Kav approx. 0.33) released from ECM by treatment with either 
alkaline borohydride or papain; and (ii) resistant to further digestion with papain 
or chondroitinase ABC, and susceptible to deamination by nitrous acid (6, 11). 

Similar results (not shown) were obtained with Sf21 cells. Again, 
heparanase activity was detected in cells infected with the hpa containing virus 
(pFhpa), but not with control virus (pF). This result was obtained with two 
independently generated recombinant viruses. Lysates of control, non-infected 
High Five cells failed to degrade the HSPG substrate. 

In subsequent experiments, the labeled HSPG substrate was incubated 
with medium conditioned by infected High Five or Sf21 cells. 

As shown in Figures 3a-b, heparanase activity, reflected by the conversion 
of the high Mr peak I substrate into the low Mr peak II which represents HS 
degradation fragments, was found in the culture medium of cells infected with the 
pFhpa2 or pFhpa4 viruses, but not with the control pF1 or pF2 viruses. No 
heparanase activity was detected in the culture medium of control non-infected 
High Five or Sf21 cells. 

The medium of cells infected with the pFhpa4 virus was passed through a 
50 kDa cut off membrane to obtain a crude estimation of the molecular weight of 
the recombinant heparanase enzyme. As demonstrated in Figure 4, all the 
enzymatic activity was retained in the upper compartment and there was no 
activity in the flow through (<50 kDa) material. This result is consistent with the 
expected molecular weight of the hpa gene product. 

In order to further characterize the hpa product, the inhibitory effect of 
heparin, a potent inhibitor of heparanase mediated HS degradation (40), was 
examined. 

As demonstrated in Figures 5a-b, conversion of the peak I substrate into 
peak II HS degradation fragments was completely abolished in the presence of 
heparin. 

Altogether, these results indicate that the heparanase enzyme is expressed 
in an active form by insect cells infected with Baculovirus containing the newly 
identified human hpa gene. 



EXAMPLE 3 
Degradation of HSPG in intact ECM 
Next, the ability of intact infected insect cells to degrade HS in intact, 
naturally produced ECM was investigated. For this purpose, High Five or Sf21 
cells were seeded on metabolically sulfate labeled ECM followed by infection 
(48 h, 28 °C) with either the pFhpa4 or control pF2 viruses. The pH of the 
medium was then adjusted to pH 6.2-6.4 and the cells further incubated with the 
labeled ECM for another 48 h at 28 °C or 24 h at 37 °C. Sulfate labeled material 
released into the incubation medium was analyzed by gel filtration on Sepharose 
6B. 

As shown in Figures 6a-b and 7a-b, incubation of the ECM with cells 
infected with the control pF2 virus resulted in a constant release of labeled 
material that consisted almost entirely (>90%) of high Mr fragments (peak I) 
eluted with or next to Vo. It was previously shown that a proteolytic activity 
residing in the ECM itself and/or expressed by cells is responsible for release of 
the high Mr material (6). This nearly intact HSPG provides a soluble substrate 
for subsequent degradation by heparanase, as also indicated by the relatively 
large amount of peak I material accumulating when the heparanase enzyme is 
inhibited by heparin (6, 7, 12, Figure 9). On the other hand, incubation of the 
labeled ECM with cells infected with the pFhpa4 virus resulted in release of 
60-70% of the ECM-associated radioactivity in the form of low Mr sulfate-labeled 
fragments (peak II, 0.5 < Kav < 0.75), regardless of whether the infected cells 
were incubated with the ECM at 28 °C or 37 °C. Control intact non-infected 
Sf21 or High Five cells failed to degrade the ECM HS side chains. 

In subsequent experiments, as demonstrated in Figures 8a-b, High Five 
and Sf21 cells were infected (96 h, 28 °C) with pFhpa4 or control pF1 viruses 
and the culture medium incubated with sulfate-labeled ECM. Low Mr HS 
degradation fragments were released from the ECM only upon incubation with 
medium conditioned by pFhpa4 infected cells. As shown in Figure 9, production 
of these fragments was abolished in the presence of heparin. No heparanase 
activity was detected in the culture medium of control, non-infected cells. These 
results indicate that the heparanase enzyme expressed by cells infected with the 
pFhpa4 virus is capable of degrading HS when complexed to other 
macromolecular constituents (i.e. fibronectin, laminin, collagen) of a naturally 
produced intact ECM, in a manner similar to that reported for highly metastatic 
tumor cells or activated cells of the immune system (6, 7). 

EXAMPLE 4 
Purification of recombinant heparanase 
The recombinant heparanase was partially purified from medium of 
pFhpa4 infected Sf21 cells by Heparin-Sepharose chromatography (Figure 10a) 
followed by gel filtration of the pooled active fractions over an FPLC Superdex 
75 column (Figure 11a). A 63 kDa protein was observed, whose quantity, as 
detected by silver stained SDS-polyacrylamide gel electrophoresis, 
correlated with heparanase activity in the relevant column fractions (Figures 10b 
and 11b, respectively). This protein was not detected in the culture medium of 
cells infected with the control pF1 virus that was subjected to a similar 
fractionation on heparin-Sepharose (not shown). 

EXAMPLE 5 

Expression of the hpa gene in various cell types, organs and tissues 
Referring now to Figures 12a-e, RT-PCR was applied to evaluate the 
expression of the hpa gene by various cell types and tissues. For this purpose, 
total RNA was reverse transcribed and amplified. The expected 585 bp long 
cDNA was clearly demonstrated in human kidney, placenta (8 and 11 weeks) and 
mole tissues, as well as in freshly isolated and short-term (1.5-48 h) cultured 
human placental cytotrophoblastic cells (Figure 12a), all known to express a high 
heparanase activity (41). The hpa transcript was also expressed by normal 
human neutrophils (Figure 12b). In contrast, there was no detectable expression 
of the hpa mRNA in embryonic human muscle tissue, thymus, heart and adrenal 
(Figure 12b). The hpa gene was expressed by several, but not all, human bladder 
carcinoma cell lines (Figure 12c), SK hepatoma (SK-hep-1), ovarian carcinoma 
(OV 1063), breast carcinoma (435, 231), melanoma and megakaryocytic (DAMI, 
CHRF) human cell lines (Figures 12d-e). 

The above described expression pattern of the hpa transcript correlated 
very well with the heparanase activity levels determined in various tissues 
and cell types (not shown). 

EXAMPLE 6 
hpa homologous genes 

EST databases were screened for sequences homologous to the hpa gene. 
Three mouse ESTs were identified (accession No. Aa177901, from mouse spleen, 
Aa067997 from mouse skin, Aa47943 from mouse embryo), assembled into a 
824 bp cDNA fragment which contains a partial open reading frame (lacking a 5' 
end) of 629 bp and a 3' untranslated region of 195 bp (SEQ ID NO:12). As 
shown in Figure 13, the coding region is 80% similar to the 3' end of the hpa 
cDNA sequence. These ESTs are probably cDNA fragments of the mouse hpa 
homolog that encodes the mouse heparanase. 
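
The degree of similarity quoted above can be illustrated with a simple identity count over an ungapped alignment. The sketch below is a minimal example and is not the method used to produce Figure 13. The two 50-nucleotide strings are the first mouse and human alignment rows of that figure, and the function name is illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two equal-length aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


mouse = "CTGGCAAGAAGGTCTGGTTGGGAGAGACGAGCTCAGCTTACGGTGGCGGT"
human = "CTGGCAAGAAGGTCTGGTTAGGAGAAACAAGCTCTGCATATGGAGGCGGA"
print(f"{percent_identity(mouse, human):.0f}%")  # 84%
```

This particular window is 84% identical; the 80% figure in the text refers to the coding region as a whole.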

Searching for consensus protein domains revealed an amino terminal 
homology between the heparanase and several precursor proteins such as 
Procollagen Alpha 1 precursor, Tyrosine-protein kinase-RYK, Fibulin-1, 
Insulin-like growth factor binding protein and several others. The amino terminus is 
highly hydrophobic and contains a potential trans-membrane domain. The 
homology to known signal peptide sequences suggests that it could function as a 
signal peptide for protein localization. 

EXAMPLE 7 

Isolation of an extended 5' end of hpa cDNA from human SK-hep1 cell line 

The 5' end of hpa cDNA was isolated from human SK-hep1 cell line by 
PCR amplification using the Marathon RACE (rapid amplification of cDNA 
ends) kit (Clontech). Total RNA was prepared from SK-hep1 cells using the 
TRI-Reagent (Molecular Research Center Inc.) according to the manufacturer's 
instructions. Poly A+ RNA was isolated using the mRNA separator kit 
(Clontech). 

The Marathon RACE SK-hep1 cDNA composite was constructed 
according to the manufacturer's recommendations. First round of amplification 
was performed using an adaptor specific primer AP1: 5'-CCATCCTAATACG 
ACTCACTATAGGGC-3', SEQ ID NO:1, and a hpa specific antisense primer 
hpl-629: 5'-CCCCAGGAGCAGCAGCATCAG-3', SEQ ID NO:17, 
corresponding to nucleotides 119-99 of SEQ ID NO:9. The resulting PCR 
product was subjected to a second round of amplification using an adaptor 
specific nested primer AP2: 5'-ACTCACTATAGGGCTCGAGCGGC-3', SEQ 
ID NO:3, and a hpa specific antisense nested primer hpl-666: 
5'-AGGCTTCGAGCGCAGCAGCAT-3', SEQ ID NO:18, corresponding to 
nucleotides 83-63 of SEQ ID NO:9. The PCR program was as follows: a hot 
start of 94 °C for 1 minute, followed by 30 cycles of 90 °C - 30 seconds, 68 °C - 
4 minutes. The resulting 300 bp DNA fragment was extracted from an agarose 
gel and cloned into the vector pGEM-T Easy (Promega). The resulting 
recombinant plasmid was designated pHPSK1. 
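
Because the gene specific primers are antisense, their coordinates are quoted in descending order (for example 83-63). Reverse complementing such a primer recovers the sense strand sequence it anneals to. The short sketch below shows this for hpl-666; the function name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


hpl_666 = "AGGCTTCGAGCGCAGCAGCAT"  # antisense primer, nucleotides 83-63
print(reverse_complement(hpl_666))  # ATGCTGCTGCGCTCGAAGCCT
```

The sense strand sequence begins with ATG, which fits an open reading frame starting at nucleotide 63 of SEQ ID NO:9.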

The nucleotide sequence of the pHPSK1 insert was determined and it was 
found to contain 62 nucleotides of the 5' end of the placenta hpa cDNA (SEQ ID 
NO:9) and additional 178 nucleotides upstream, the first 178 nucleotides of SEQ 
ID NOs:13 and 15. 

A single nucleotide discrepancy was identified between the SK-hep1 
cDNA and the placenta cDNA. The "T" derivative at position 9 of the placenta 
cDNA (SEQ ID NO:9), is replaced by a "C" derivative at the corresponding 
position 187 of the SK-hep1 cDNA (SEQ ID NO:13). 

The discrepancy is likely to be due to a mutation at the 5' end of the 
placenta cDNA clone as confirmed by sequence analysis of several additional 
cDNA clones isolated from placenta, which like the SK-hep1 cDNA contained C 
at position 9 of SEQ ID NO:9. 

The 5' extended sequence of the SK-hep1 hpa cDNA was assembled with 
the sequence of the hpa cDNA isolated from human placenta (SEQ ID NO:9). 
The assembled sequence contained an open reading frame which encodes, as 
shown in SEQ ID NOs:14 and 15, a polypeptide of 592 amino acids with a 
calculated molecular weight of 66,407 daltons. The open reading frame is 
flanked by 93 bp 5' untranslated region (UTR). 

EXAMPLE 8 

Isolation of the upstream genomic region of the hpa gene 

The upstream region of the hpa gene was isolated using the Genome 
Walker kit (Clontech) according to the manufacturer's recommendations. The kit 
includes five human genomic DNA samples each digested with a different 
restriction endonuclease creating blunt ends: EcoRV, ScaI, DraI, PvuII and SspI. 

The blunt ended DNA fragments are ligated to partially single stranded 
adaptors. The genomic DNA samples were subjected to PCR amplification 
using the adaptor specific primer and a gene specific primer. Amplification was 
performed with Expand High Fidelity (Boehringer Mannheim). 

A first round of amplification was performed using the ap1 primer: 5'-G 
TAATACGACTCACTATAGGGC-3', SEQ ID NO:19, and the hpa specific 
antisense primer hpl-666: 5'-AGGCTTCGAGCGCAGCAGCAT-3', SEQ ID 
NO:18, corresponding to nucleotides 83-63 of SEQ ID NO:9. The PCR 
program was as follows: a hot start of 94 °C - 3 minutes, followed by 36 cycles of 
94 °C - 40 seconds, 67 °C - 4 minutes. 

The PCR products of the first amplification were diluted 1:50. One µl of 
the diluted sample was used as a template for a second amplification using a 
nested adaptor specific primer ap2: 5'-ACTATAGGGCACGCGTGGT-3', SEQ 
ID NO:20, and a hpa specific antisense primer hpl-690, 5'-CTTGGGCTCACC 
TGGCTGCTC-3', SEQ ID NO:21, corresponding to nucleotides 62-42 of SEQ 
ID NO:9. The resulting amplification products were analyzed using agarose gel 
electrophoresis. Five different PCR products were obtained from the five 
amplification reactions. A DNA fragment of approximately 750 bp which was 
obtained from the SspI digested DNA sample was gel extracted. The purified 
fragment was ligated into the plasmid vector pGEM-T Easy (Promega). The 
resulting recombinant plasmid was designated pGHP6905 and the nucleotide 
sequence of the hpa insert was determined. 

A partial sequence of 594 nucleotides is shown in SEQ ID NO:16. The 
last nucleotide in SEQ ID NO:16 corresponds to nucleotide 93 in SEQ ID NO:13. 
The DNA sequence in SEQ ID NO:16 contains the 5' region of the hpa cDNA 
and 501 nucleotides of the genomic upstream region which are predicted to 
contain the promoter region of the hpa gene. 

EXAMPLE 9 

Expression of the 592 amino acids HPA polypeptide in a human 293 cell line 
The 592 amino acids open reading frame (SEQ ID NOs:13 and 15) was 
constructed by ligation of the 110 bp corresponding to the 5' end of the SK-hep1 
hpa cDNA with the placenta cDNA. More specifically the Marathon RACE - 
PCR amplification product of the placenta hpa DNA was digested with SacI and 
an approximately 1 kb fragment was ligated into a SacI-digested pGHP6905 
plasmid. The resulting plasmid was digested with EarI and AatII. The EarI 
sticky ends were blunted and an approximately 280 bp EarI/blunt-AatII fragment 
was isolated. This fragment was ligated with pFasthpa digested with EcoRI 
which was blunt ended using Klenow fragment and further digested with AatII. 
The resulting plasmid contained a 1827 bp insert which includes an open reading 
frame of 1776 bp, 31 bp of 3' UTR and 21 bp of 5' UTR. This plasmid was 
designated pFastLhpa. 

A mammalian expression vector was constructed to drive the expression 
of the 592 amino acids heparanase polypeptide in human cells. The hpa cDNA 
was excised from pFastLhpa with BssHII and NotI. The resulting 1850 bp 
BssHII-NotI fragment was ligated to a mammalian expression vector pSI 
(Promega) digested with MluI and NotI. The resulting recombinant plasmid, 
pSIhpaMet2, was transfected into a human 293 embryonic kidney cell line. 

Transient expression of the 592 amino-acids heparanase was examined by 
western blot analysis and the enzymatic activity was tested using the gel shift 
assay. Both these procedures are described at length in U.S. Pat. application No. 
09/071,739, filed May 1, 1998, which is incorporated by reference as if fully set 
forth herein. Cells were harvested 3 days following transfection. Harvested cells 
were re-suspended in lysis buffer containing 150 mM NaCl, 50 mM Tris pH 7.5, 
1% Triton X-100, 1 mM PMSF and protease inhibitor cocktail (Boehringer 
Mannheim). 40 µg protein extract samples were used for separation on 
SDS-PAGE. Proteins were transferred onto a PVDF Hybond-P membrane 
(Amersham). The membrane was incubated with an affinity purified polyclonal 
anti heparanase antibody, as described in U.S. Pat. application No. 09/071,739. 
A major band of approximately 50 kDa was observed in the transfected cells as 
well as a minor band of approximately 65 kDa. A similar pattern was observed 
in extracts of cells transfected with the pShpa as demonstrated in U.S. Pat. 
application No. 09/071,739. These two bands probably represent two forms of 
the recombinant heparanase protein produced by the transfected cells. The 65 
kDa protein probably represents a heparanase precursor, while the 50 kDa protein 
is suggested herein to be the processed or mature form. 

The catalytic activity of the recombinant protein expressed in the 
pShpaMet2 transfected cells was tested by gel shift assay. Cell extracts of 
transfected and of mock transfected cells were incubated overnight with heparin 
(6 µg in each reaction) at 37 °C, in the presence of 20 mM phosphate citrate 
buffer pH 5.4, 1 mM CaCl2, 1 mM DTT and 50 mM NaCl. Reaction mixtures 
were then separated on a 10 % polyacrylamide gel. The catalytic activity of the 
recombinant heparanase was clearly demonstrated by a faster migration of the 
heparin molecules incubated with the transfected cell extract as compared to the 
control. Faster migration indicates the disappearance of high molecular weight 
heparin molecules and the generation of low molecular weight degradation 
products. 



EXAMPLE 10 
Chromosomal localization of the hpa gene 

Chromosomal mapping of the hpa gene was performed utilizing a panel of 
monochromosomal human/CHO and human/mouse somatic cell hybrids, 
obtained from the UK HGMP Resource Center (Cambridge, England). 

40 ng of each of the somatic cell hybrid DNA samples were subjected to 
PCR amplification using the hpa primers: hpu565 5'-AGCTCTGTAGATGTGC 
TATACAC-3', SEQ ID NO:22, corresponding to nucleotides 564-586 of SEQ ID 
NO:9 and an antisense primer hpllVl 5'-GCATCTTAGCCGTCTTTCTTCG-3', 
SEQ ID NO:23, corresponding to nucleotides 897-876 of SEQ ID NO:9. 

The PCR program was as follows: a hot start of 94 °C - 3 minutes, 
followed by 7 cycles of 94 °C - 45 seconds, 66 °C - 1 minute, 68 °C - 5 minutes, 
followed by 30 cycles of 94 °C - 45 seconds, 62 °C - 1 minute, 68 °C - 5 
minutes, and a 10 minutes final extension at 72 °C. 
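
A multi-stage cycling program of this kind can be written down as data, which makes it easy to check the total programmed hold time. The sketch below encodes the program described above. The data layout and names are illustrative, and ramp times between temperatures are ignored.

```python
# Each stage: (number of cycles, [(temperature in °C, hold time in seconds), ...])
PROGRAM = [
    (1, [(94, 180)]),                       # hot start
    (7, [(94, 45), (66, 60), (68, 300)]),   # first 7 cycles
    (30, [(94, 45), (62, 60), (68, 300)]),  # following 30 cycles
    (1, [(72, 600)]),                       # final extension
]


def total_seconds(program) -> int:
    """Sum of all programmed hold times, ignoring ramping between steps."""
    return sum(cycles * sum(hold for _, hold in steps) for cycles, steps in program)


seconds = total_seconds(PROGRAM)
print(f"{seconds} s = {seconds / 3600:.1f} h")
```

The long 68 °C extension step dominates the run time, as expected for a program designed to amplify a genomic product of several kilobases.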

The reactions were performed with Expand long PCR (Boehringer 
Mannheim). The resulting amplification products were analyzed using agarose 
gel electrophoresis. As demonstrated in Figure 14, a single band of 
approximately 2.8 kb was obtained from chromosome 4, as well as from the 
control human genomic DNA. A 2.8 kb amplification product is expected based 
on amplification of the genomic hpa clone (data not shown). No amplification 
products were obtained either in the control DNA samples of hamster and 
mouse or in somatic hybrids of other human chromosomes. 
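
The mapping logic is simple: the gene is assigned to the chromosome whose hybrid yields the expected product, provided the rodent background controls are negative. A minimal sketch follows. The result dictionary is a hypothetical illustration modelled on the outcome described above, not raw data, and the function name is an assumption.

```python
def assign_chromosome(hybrid_results: dict[str, bool], controls: dict[str, bool]) -> str:
    """Return the single human chromosome whose hybrid gave a PCR product.

    hybrid_results maps chromosome name -> whether a product was seen.
    controls maps rodent background name -> whether a product was seen.
    """
    if any(controls.values()):
        raise ValueError("rodent control amplified; assignment is not reliable")
    positive = [chrom for chrom, seen in hybrid_results.items() if seen]
    if len(positive) != 1:
        raise ValueError(f"expected exactly one positive hybrid, got {positive}")
    return positive[0]


# Hypothetical panel: chromosomes 1-22, X and Y, with only chromosome 4 positive.
panel = {name: False for name in [str(n) for n in range(1, 23)] + ["X", "Y"]}
panel["4"] = True
print(assign_chromosome(panel, {"hamster": False, "mouse": False}))  # 4
```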

Although the invention has been described in conjunction with specific 
embodiments thereof, it is evident that many alternatives, modifications and 
variations will be apparent to those skilled in the art. Accordingly, it is intended 
to embrace all such alternatives, modifications and variations that fall within the 
spirit and broad scope of the appended claims. 

WHAT IS CLAIMED IS: 



1. A polynucleotide fragment comprising a polynucleotide sequence 
encoding a polypeptide having heparanase catalytic activity. 

2. The polynucleotide fragment of claim 1, wherein said 
polynucleotide sequence includes nucleotides 63-1691 of SEQ ID NO:9, or 
nucleotides 139-1869 of SEQ ID NO:13. 

3. The polynucleotide fragment of claim 1, wherein said 
polynucleotide sequence includes nucleotides 63-721 of SEQ ID NO:9. 

4. The polynucleotide fragment of claim 1, wherein said 
polynucleotide is as set forth in SEQ ID NOs:9 or 13. 

5. The polynucleotide fragment of claim 1, wherein said 
polynucleotide sequence includes a segment of SEQ ID NOs:9 or 13, said 
segment encodes said polypeptide having said heparanase catalytic activity. 

6. The polynucleotide fragment of claim 1, wherein said polypeptide 
includes an amino acid sequence as set forth in SEQ ID NOs:10 or 14. 

7. The polynucleotide fragment of claim 1, wherein said polypeptide 
includes a segment of SEQ ID NOs:10 or 14, said segment harbors said 
heparanase catalytic activity. 

8. The polynucleotide fragment of claim 1, wherein said 
polynucleotide sequence is selected from the group consisting of double stranded 
DNA, single stranded DNA and RNA. 

9. A single stranded polynucleotide fragment comprising a 
polynucleotide sequence complementary to at least a portion of a polynucleotide 
strand encoding a polypeptide having heparanase catalytic activity. 

10. The polynucleotide fragment of claim 9, wherein said 
polynucleotide sequence includes at least a portion of SEQ ID NOs:9 or 13. 

11. A vector comprising a polynucleotide sequence encoding a 
polypeptide having heparanase catalytic activity. 

12. The vector of claim 11, wherein said polynucleotide sequence 
includes nucleotides 63-1691 of SEQ ID NO:9, or nucleotides 139-1869 of SEQ 
ID NO:13. 

13. The vector of claim 11, wherein said polynucleotide sequence 
includes nucleotides 63-721 of SEQ ID NO:9. 

14. The vector of claim 11, wherein said polynucleotide sequence is as 
set forth in SEQ ID NOs:9 or 13. 

15. The vector of claim 11, wherein said polynucleotide sequence 
includes a segment of SEQ ID NOs:9 or 13, said segment encodes said 
polypeptide having said heparanase catalytic activity. 

16. The vector of claim 11, wherein said polypeptide includes an amino 
acid sequence as set forth in SEQ ID NOs:10 or 14. 

17. The vector of claim 11, wherein said polypeptide includes a 
segment of SEQ ID NOs:10 or 14, said segment harbors said heparanase catalytic 
activity. 

18. The vector of claim 11, wherein said polynucleotide sequence is 
selected from the group consisting of double stranded DNA, single stranded 
DNA and RNA. 

19. A host cell comprising an exogenous polynucleotide fragment 
including a polynucleotide sequence encoding a polypeptide having heparanase 
catalytic activity. 

20. The host cell of claim 19, wherein said polynucleotide sequence 
includes nucleotides 63-1691 of SEQ ID NO:9, or nucleotides 139-1869 of SEQ 
ID NO:13. 

21. The host cell of claim 19, wherein said polynucleotide sequence 
includes nucleotides 63-721 of SEQ ID NO:9. 

22. The host cell of claim 19, wherein said polynucleotide sequence is 
as set forth in SEQ ID NOs:9 or 13. 

23. The host cell of claim 19, wherein said polynucleotide sequence 
includes a segment of SEQ ID NOs:9 or 13, said segment encodes said 
polypeptide having said heparanase catalytic activity. 

24. The host cell of claim 19, wherein said polypeptide includes an 
amino acid sequence as set forth in SEQ ID NOs:10 or 14. 

25. The host cell of claim 19, wherein said polypeptide includes a 
segment of SEQ ID NOs:10 or 14, said segment harbors said heparanase catalytic 
activity. 

26. The host cell of claim 19, wherein said polynucleotide sequence is 
selected from the group consisting of double stranded DNA, single stranded 
DNA and RNA. 

27. A host cell expressing a recombinant heparanase. 

28. A recombinant protein comprising a polypeptide having heparanase 
catalytic activity. 

29. The recombinant protein of claim 28, wherein said polypeptide 
includes a segment of SEQ ID NOs:10 or 14. 

30. A polynucleotide fragment comprising a polynucleotide sequence 
capable of hybridizing with nucleotides 1-721 of SEQ ID NO:9. 

31. A polynucleotide sequence as set forth in SEQ ID NOs:9 or 13. 

32. A polynucleotide sequence homologous to SEQ ID NOs:9 or 13. 

33. An amino acid sequence as set forth in SEQ ID NOs:10 or 14. 

34. An amino acid sequence homologous to SEQ ID NOs:10 or 14. 

35. A pharmaceutical composition comprising as an active ingredient a 
recombinant protein having heparanase catalytic activity. 



36. A heparanase overexpression system comprising a cell 
overexpressing heparanase catalytic activity. 



37. A modulator of heparin-binding growth factors, cellular responses 
to heparin-binding growth factors and cytokines, cell interaction with plasma 
lipoproteins, cellular susceptibility to viral, protozoa and bacterial infections or 
disintegration of neurodegenerative plaques comprising as an active ingredient a 
recombinant protein having heparanase catalytic activity. 



38. A medical equipment comprising a medical device containing, as 
an active ingredient, a recombinant protein having heparanase catalytic activity. 



39. The vector of claim 11, wherein said vector is a baculovirus vector. 

40. The host cell of claim 19, wherein said cell is an insect cell. 

41. The host cell of claim 27, wherein said cell is an insect cell. 

42. A method of identifying a chromosome region harboring a human 
heparanase gene in a chromosome spread comprising the steps of: 

(a) hybridizing the chromosome spread with a tagged polynucleotide 
probe encoding heparanase; 

(b) washing the chromosome spread, thereby removing excess of 
non-hybridized probe; and 

(c) searching for signals associated with said hybridized tagged 
polynucleotide probe, wherein detected signals being indicative of 
a chromosome region harboring a human heparanase gene. 

1 CTAGAGCTTTCGACTCTCCGCTGCGCGGCAGCTGGCGGGGGGAGCAGCCAGGTGAGCCCA 

61 AG.=.?GCTGCT'SCC-CTCGAAGCCTGCGCTGCCGCCGCCGCTGATGCTGCTGCTCC7GG-SGC 
MLLRSKPALPPPLMLLLLGP 

121 CGC7C-:-r-rcrCCTCTCCCCTGGCGCCCTGCCCCGACCTGCGCAAGCACAGGAC3TCG?GG 
LGPLS PGAL PRPAOAQDVVD 

181 ACCTG3ACTTCT7CACCCAGGAGCCGCTGCACCTGGTGAGCCCCTCGTTCCTGTCC-3TCA 
LDFFTQEPLHLVSPS TLSVT 

2 41 Ca'.?73AC3CCA.ACCTGGCCACGGACCCGCSGTTCCTCATCCTCCTGGGTTCTCC.---,-.GC 
IDANLATDPRFLILLGSPKL 

301 TTC3?.-.CCT?GGCC,i:=A3GCTTGTCTCCTGCGTACCTGAGGTTTGGTGGCACCP^i.3.-.CAG 
RTLARGLSPAYLRFGGTKTD 

361 ACT?CZ:PA:7T?C-3A7CCC.!iAG.^AGGAATCA^CCTTTGAAGAGAGAAGTTACTC-C-C>A': 
FLIFDPKKESTFEERSYWQS 

4 21 CTw-.-70C.-ACC.^i^GATATTTGC AAATATGGATCCATCCCTCCTGATG?GC-AC<-A3.^^^^ 
QVN'QDICKYGSI PPDVEHTKL 

481 TACC-3":: aC-AKTGGC 3CTAC C AGGAGC AATTGC TAG TCCGAG tAC AC TACC AG ^J^J^.-J^^? 

RIEWPYQEQLLLREHYQKKF 

541 TCAi:-A-,CA:-CACC7ACTCAAGA-AGCTCTGrAC^TGTGCTATACAX7TTTGC.::.^^^ I-CT 
K.N'STYSRSSVDVLYTFANCS 

601 CAC-:-.-.:t :-:^-.C77^A7CT7TG^CCTAAATGCGTTA.TTAJii;A.-rAGCAGAr':^■:-CAG: 

GLOLI FGLNALLRTADLQWN 

o 6 2 AC A37:c 7.-A7 3C 7 CA-37TGC TCC T GGAC TAG TGC TCTTCCAAGGGG 7.-.7.--A7 A7 7 7C T7 
SSN'A. QLLLDYCS S KGYKISW 

7 21 GG :-.-.-.C : A3 3 CAAT 3.-.ACC TAAC AG TT T C C 7TA.AGAAGGCTGAT.A7 777 C A.7 CA.A7 3 3 G7 
Z h G ii' E P N S F L KK-A D I F I KG S 

(T) 

7 81 CGC AC- 77 A'3-3AG.-_AGAT T AT ATTC.AA.T TGC AT AAAC T TC TAAG AAAG T CCA.CC T TC.->,AA 

V i I r Y I o L E X L L R :< 5 T r V 

(T) 

8 41 A7 3 7 C 7 AT G 3 TC C TG AT GT T GG T C AG C CTCGAAG AAA.GACG 3 C 7 A-A3 A7 GC 7 3A 

A?:i YG PDVGQ PRRKTAKMLK 

901 ---3A3:: :777GA_AGGCrGGTGGAG.AAGTGATTGATTCA.GTT.ACATG3CA7CAC7A77A7T 
SFLKAGGEVIDSVTWF. 5YYL 

951 TG A-.7 3 3AC 3 3AC TGC TAC C AGGGAAG A? T T 7 C TAJ^AC C C TG AT GTA7 7 G 3 AC A77 7 ?7A 
KGRTATREDFLNPDVLDiri 

1021 TT77;-C7G7GCA_A;^AAG?TTTCCAGGTGGTTGAGAGCACCAGGCCTG3CA-=GA.AG37CT 
SSVQKVFQVVESTRPGKKVW 

1 0 e 1 GG77.-.:-3A3A_-A.C AAGCTC TGCAT ATGGAGGC GGAGCGCCC TTGC T.ATCCGAC ACC 7TTG 
LGET SSAYGGGAPLLSDTFA 

1141 CA3C:33C777ATG7GGCTGGATAAATTGGGCCTGTCAGCCCGA_ATG-:^3AA7AGA,A37GG 
AGFMWLDK LGLSARMGIEVV 

12 01 TG A7 3 A33 C.-AG T ATT C TT TGGAGC AGGAA.AC TAC C ATT TAG TG GA.TG.-_A-AC 7 TC GA.TC 

MRQV F FG AG N Y H LVDEN FD ? 

1261 C 7 7 7.-.C C T GATTAT TGGC T ATC TC TTC TG T TC AAjGAAAT TGGTG GGC ACC AAG G TG 7 TAA 
LPDYWLSLLFKKLVGTKVLM 

13 21 T33CA-.3CG7GCA-AGGTTCAAAGASAAGGAAGCTTCGA-GTATACCTTCATTGCACA.AACA 

ASVQGS krrklrvy lhctnt 

1381 CTC-AC AA7C C AAGG7ATAAAGAAGGAGATTTAAC TCTGTATGCC ATAAACC TC C A7AACG 
DNPRYKEGDLTLYAINLENV 

14 41 TCAC:AA37ACTTGCGGTTACCCTATCCTTTTTCTAACAAGCAAGTGGA7-AAA7ACCTTC 

tkylrlpy pfsmkqvdkyll 

1501 T£AG=CCTTTGGGACCTCATGGATTACTTTCCAAATCTGTCCAACTCAATGGTCTAACTC 
P-PLGPHGLLSKSVQLKGLTL 

1 S S 1 T.:kAA::-A7C-37GGATGATCAAACCTTGCCACCTTTAATGGAAAAACCTCTCCGGCCA.GGAA 
Kf-*. VDDQTL PP LM£K PLRPGS 

152 J G7T--::7:-r-GC77GCCAC-CTTTCTCATA7AGT7TTTTTGTGATA.AGA^.-.7C-CC.-A.---77G 
SLGLPAFS YS FFVl ?- NAKVA 
mouse CTGGCAAGAAGGTCTGGTTGGGAGAGACGAGCTCAGCTTACGGTGGCGGT 50 
human CTGGCAAGAAGGTCTGGTTAGGAGAAACAAGCTCTGCATATGGAGGCGGA 1115 

mouse GCACCCTTGCTGTCCAACACCTTTGCAGCTGGCTTTATGTGGCTGGATAA 100 
human GCGCCCTTGCTATCCGACACCTTTGCAGCTGGCTTTATGTGGCTGGATAA 1165 

mouse ATTGGGCCTGTCAGCCCAGATGGGCATAGAAGTCGTGATGAGGCAGGTGT 150 
human ATTGGGCCTGTCAGCCCGAATGGGAATAGAAGTGGTGATGAGGCAAGTAT 1215 

mouse TCTTCGGAGCAGGCAACTACCACTTAGTGGATGAAAACTTTGAGCCTTTA 200 
human TCTTTGGAGCAGGAAACTACCATTTAGTGGATGAAAACTTCGATCCTTTA 1265 

mouse CCTGATTACTGGCTCTCTCTTCTGTTCAAGAAACTGGTAGGTCCCAGGGT 250 
human CCTGATTATTGGCTATCTCTTCTGTTCAAGAAATTGGTGGGCACCAAGGT 1315 

mouse GTTACTGTCAAGAGTGAAAGGCCCAGACAGGAGCAAACTCCGAGTGTATC 300 
human GTTAATGGCAAGCGTGCAAGGTTCAAAGAGAAGGAAGCTTCGAGTATACC 1365 

mouse TCCACTGCACTAACGTCTATCACCCACGATATCAGGAAGGAGATCTAACT 350 
human TTCATTGCACAAACACTGACAATCCAAGGTATAAAGAAGGAGATTTAACT 1415 

mouse CTGTATGTCCTGAACCTCCATAATGTCACCAAGCACTTGAAGGTACCGCC 400 
human CTGTATGCCATAAACCTCCATAACGTCACCAAGTACTTGCGGTTACCCTA 1465 

mouse TCCGTTGTTCAGGAAACCAGTGGATACGTACCTTCTGAAGCCTTCGGGGC 450 
human TCCTTTTTCTAACAAGCAAGTGGATAAATACCTTCTAAGACCTTTGGGAC 1515 

mouse CGGATGGATTACTTTCCAAATCTGTCCAACTGAACGGTCAAATTCTGAAG 500 
human CTCATGGATTACTTTCCAAATCTGTCCAACTCAATGGTCTAACTCTAAAG 1565 

mouse ATGGTGGATGAGCAGACCCTGCCAGCTTTGACAGAAAAACGTCTCCCCGC 550 
human ATGGTGGATGATCAAACCTTGCCACCTTTAATGGAAAAACCTCTCCGGCC 1615 

mouse AGGAAGTGCACTAAGCCTGCCTGCCTTTTCCTATGGTTTTTTTGTCATAA 600 
human AGGAAGTTCACTGGGCTTGCCAGCTTTCTCATATAGTTTTTTTGTGATAA 1665 

mouse GAAATGCCAAAATCGCTGCTTGTATATGAAAATAAAA 637 
human GAAATGCCAAAGTTGCTGCTTGCATCTGAAAATAAAA 1702 
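
The degree of similarity shown by an ungapped alignment such as the one above can be summarized as a percent identity. The following is a minimal sketch, not part of the original disclosure; the function name `percent_identity` is an assumption introduced for illustration, and the two strings are the first aligned row of the figure (mouse positions 1 to 50, human positions 1066 to 1115).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring letter case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# First aligned row of the mouse/human figure (hypothetical variable names).
mouse = "CTGGCAAGAAGGTCTGGTTGGGAGAGACGAGCTCAGCTTACGGTGGCGGT"
human = "CTGGCAAGAAGGTCTGGTTAGGAGAAACAAGCTCTGCATATGGAGGCGGA"

print(f"{percent_identity(mouse, human):.1f}% identity over {len(mouse)} positions")
```

Because the alignment contains no gaps, a simple position-by-position comparison is sufficient; a gapped alignment would need a gap-aware definition of identity.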
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Iris Pecker, Israel Vlodavsky and Elena 
Feinstein 

(ii) TITLE OF INVENTION: POLYNUCLEOTIDE ENCODING A POLYPEPTIDE 
HAVING HEPARANASE ACTIVITY AND EXPRESSION OF 
SAME IN TRANSDUCED CELLS 

(iii) NUMBER OF SEQUENCES: 23 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: Mark M. Friedman c/o Robert Sheinbein 
(B) STREET: 2940 Birchtree Lane 
(C) CITY: Silver Spring 
(D) STATE: Maryland 
(E) COUNTRY: United States of America 
(F) ZIP: 20906 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: megabyte, 3.5" microdisk 
(B) COMPUTER: Twinhead* Slimnote-890TX 
(C) OPERATING SYSTEM: MS DOS version 6.2, 
Windows version 3.11 
(D) SOFTWARE: Word for Windows version 2.0 converted to 
an ASCII file 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: 
(B) FILING DATE: 
(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: 08/922,170 
(B) FILING DATE: 2 SEP 1997 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Friedman, Mark M. 
(B) REGISTRATION NUMBER: 33,883 
(C) REFERENCE/DOCKET NUMBER: 910/1 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: 972-3-5625553 
(B) TELEFAX: 972-3-5625554 
(C) TELEX: 
(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
CCATCCTAAT ACGACTCACT ATAGGGC 27 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
GTAGTGATGC CATGTAACTG AATC 24 

(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 
ACTCACTATA GGGCTCGAGC GGC 23 

(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 
GCATCTTAGC CGTCTTTCTT CG 22 

(2) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 
TTTTTTTTTT TTTTT 15 

(2) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
TTCGATCCCA AGAAGGAATC AAC 23 

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 
GTAGTGATGC CATGTAACTG AATC 24 

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 
Tyr Gly Pro Asp Val Gly Gln Pro Arg 
5 9 

(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1721 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 
CTAGAGCTTT CGACTCTCCG CTGCGCGGCA GCTGGCGGGG GGAGCAGCCA GGTGAGCCCA 60 
AGATGCTGCT GCGCTCGAAG CCTGCGCTGC CGCCGCCGCT GATGCTGCTG CTCCTGGGGC 120 
CGCTGGGTCC CCTCTCCCCT GGCGCCCTGC CCCGACCTGC GCAAGCACAG GACGTCGTGG 180 
ACCTGGACTT CTTCACCCAG GAGCCGCTGC ACCTGGTGAG CCCCTCGTTC CTGTCCGTCA 240 
CCATTGACGC CAACCTGGCC ACGGACCCGC GGTTCCTCAT CCTCCTGGGT TCTCCAAAGC 300 
TTCGTACCTT GGCCAGAGCC TTGTCTCCTG CGTACCTGAG GTTTGGTGGC ACCAAGACAG 360 
ACTTCCTAAT TTTCGATCCC AAGAAGGAAT CAACCTTTGA AGAGAGAAGT TACTGGCAAT 420 
CTCAAGTCAA CCAGGATATT TGCAAATATG GATCCATCCC TCCTGATGTG GAGGAGAAGT 480 
TACGGTTGGA ATGGCCCTAC CAGGAGCAAT TGCTACTCCG AGAACACTAC CAGAAAAAGT 540 
TCAAGAACAG CACCTACTCA AGAAGCTCTG TAGATGTGCT ATACACTTTT GCAAACTGCT 600 
CAGGACTGGA CTTGATCTTT GCCCTAAATG CGTTATTAAG AACAGCAGAT TTGCAGTGGA 660 
ACAGTTCTAA TGCTCAGTTG CTCCTGGACT ACTGCTCTTC CAAGGGGTAT AACATTTCTT 720 
GGGAACTAGG CAATGAACCT AACAGTTTCC TTAAGAAGGC TGATATTTTC ATCAATGGGT 780 
CGCAGTTAGG AGAAGATTAT ATTCAATTGC ATAAACTTCT AAGAAAGTCC ACCTTCAAAA 840 
ATGCAAAACT CTATGGTCCT GATGTTCCTC AGCCTCGAAG AAAGACGGCT AAGATGCTGA 900 
AGAGCTTCCT GAAGGCTGGT GGAGAAGTGA TTGATTCAGT TACATGGCAT CACTACTATT 960 
TGAATGGACG GACTGCTACC AGGGAAGATT TTCTAAACCC TGATGTATTG GACATTTTTA 1020 
TTTCATCTGT GCAAAAAGTT TTCCAGGTGG TTGAGAGCAC CAGGCCTGGC AAGAAGGTCT 1080 
GGTTAGGAGA AACAAGCTCT GCATATGGAG GCGGAGCGCC CTTGCTATCC GACACCTTTG 1140 
CAGCTGGCTT TATGTCGCTG GATAAATTGG GCCTGTCAGC CCGAATGGGA ATAGAAGTGG 1200 
TGATGAGGCA AGTATTCTTT GGAGCAGGAA ACTACCATTT AGTGGATGAA AACTTCGATC 1260 
CTTTACCTGA TTATTGGCTA TCTCTTCTGT TCAAGAAATT GGTGGGCACC AAGGTGTTAA 1320 
TGGCAAGCGT GCAAGGTTCA AAGAGAAGGA AGCTTCGAGT ATACCTTCAT TGCACAAACA 1380 
CTGACAATCC AAGGTATAAA GAAGGAGATT TAACTCTGTA TGCCATAAAC CTCCATAACG 1440 
TCACCAAGTA CTTGCGGTTA CCCTATCCTT TTTCTAACAA GCAAGTGGAT AAATACCTTC 1500 
TAAGACCTTT CGGACCTCAT CGATTACTTT CCAAATCTGT CCAACTCAAT GGTCTAACTC 1560 
TAAAGATGGT GGATGATCAA ACCTTCCCAC CTTTAATGGA AAAACCTCTC CGGCCAGGAA 1620 
GTTCACTGGG CTTGCCAGCT TTCTCATATA GTTTTTTTGT GATAAGAAAT GCCAAAGTTG 1680 
CTGCTTGCAT CTGAAAATAA AATATACTAG TCCTGACACT G 1721 

(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 543 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

Met Leu Leu Arg Ser Lys Pro Ala Leu Pro Pro Pro Leu Met Leu Leu 
5 10 15 

Leu Leu Gly Pro Leu Gly Pro Leu Ser Pro Gly Ala Leu Pro Arg Pro 
20 25 30 

Ala Gln Ala Gln Asp Val Val Asp Leu Asp Phe Phe Thr Gln Glu Pro 
35 40 45 

Leu His Leu Val Ser Pro Ser Phe Leu Ser Val Thr Ile Asp Ala Asn 
50 55 60 

Leu Ala Thr Asp Pro Arg Phe Leu Ile Leu Leu Gly Ser Pro Lys Leu 
65 70 75 80 

Arg Thr Leu Ala Arg Gly Leu Ser Pro Ala Tyr Leu Arg Phe Gly Gly 
85 90 95 

Thr Lys Thr Asp Phe Leu Ile Phe Asp Pro Lys Lys Glu Ser Thr Phe 
100 105 110 

Glu Glu Arg Ser Tyr Trp Gln Ser Gln Val Asn Gln Asp Ile Cys Lys 
115 120 125 

Tyr Gly Ser Ile Pro Pro Asp Val Glu Glu Lys Leu Arg Leu Glu Trp 
130 135 140 

Pro Tyr Gln Glu Gln Leu Leu Leu Arg Glu His Tyr Gln Lys Lys Phe 
145 150 155 160 

Lys Asn Ser Thr Tyr Ser Arg Ser Ser Val Asp Val Leu Tyr Thr Phe 
165 170 175 

Ala Asn Cys Ser Gly Leu Asp Leu Ile Phe Gly Leu Asn Ala Leu Leu 
180 185 190 

Arg Thr Ala Asp Leu Gln Trp Asn Ser Ser Asn Ala Gln Leu Leu Leu 
195 200 205 

Asp Tyr Cys Ser Ser Lys Gly Tyr Asn Ile Ser Trp Glu Leu Gly Asn 
210 215 220 

Glu Pro Asn Ser Phe Leu Lys Lys Ala Asp Ile Phe Ile Asn Gly Ser 
225 230 235 240 

Gln Leu Gly Glu Asp Tyr Ile Gln Leu His Lys Leu Leu Arg Lys Ser 
245 250 255 

Thr Phe Lys Asn Ala Lys Leu Tyr Gly Pro Asp Val Gly Gln Pro Arg 
260 265 270 

Arg Lys Thr Ala Lys Met Leu Lys Ser Phe Leu Lys Ala Gly Gly Glu 
275 280 285 

Val Ile Asp Ser Val Thr Trp His His Tyr Tyr Leu Asn Gly Arg Thr 
290 295 300 

Ala Thr Arg Glu Asp Phe Leu Asn Pro Asp Val Leu Asp Ile Phe Ile 
305 310 315 320 

Ser Ser Val Gln Lys Val Phe Gln Val Val Glu Ser Thr Arg Pro Gly 
325 330 335 

Lys Lys Val Trp Leu Gly Glu Thr Ser Ser Ala Tyr Gly Gly Gly Ala 
340 345 350 

Pro Leu Leu Ser Asp Thr Phe Ala Ala Gly Phe Met Trp Leu Asp Lys 
355 360 365 

Leu Gly Leu Ser Ala Arg Met Gly Ile Glu Val Val Met Arg Gln Val 
370 375 380 

Phe Phe Gly Ala Gly Asn Tyr His Leu Val Asp Glu Asn Phe Asp Pro 
385 390 395 400 

Leu Pro Asp Tyr Trp Leu Ser Leu Leu Phe Lys Lys Leu Val Gly Thr 
405 410 415 

Lys Val Leu Met Ala Ser Val Gln Gly Ser Lys Arg Arg Lys Leu Arg 
420 425 430 

Val Tyr Leu His Cys Thr Asn Thr Asp Asn Pro Arg Tyr Lys Glu Gly 
435 440 445 

Asp Leu Thr Leu Tyr Ala Ile Asn Leu His Asn Val Thr Lys Tyr Leu 
450 455 460 

Arg Leu Pro Tyr Pro Phe Ser Asn Lys Gln Val Asp Lys Tyr Leu Leu 
465 470 475 480 

Arg Pro Leu Gly Pro His Gly Leu Leu Ser Lys Ser Val Gln Leu Asn 
485 490 495 

Gly Leu Thr Leu Lys Met Val Asp Asp Gln Thr Leu Pro Pro Leu Met 
500 505 510 

Glu Lys Pro Leu Arg Pro Gly Ser Ser Leu Gly Leu Pro Ala Phe Ser 
515 520 525 

Tyr Ser Phe Phe Val Ile Arg Asn Ala Lys Val Ala Ala Cys Ile 
530 535 540 543 

(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1718 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

CT AGA GCT TTC GAC 14 

TCT CCG CTG CGC GGC AGC TGG CGG GGG GAG CAG CCA CGT GAG CCC AAG 62 

ATG CTG CTG CGC TCG AAG CCT GCG CTG CCG CCG CCG CTG ATG CTG CTG 110 
Met Leu Leu Arg Ser Lys Pro Ala Leu Pro Pro Pro Leu Met Leu Leu 
5 10 15 

CTC CTG GGG CCG CTG GGT CCC CTC TCC CCT GGC GCC CTG CCC CGA CCT 158 
Leu Leu Gly Pro Leu Gly Pro Leu Ser Pro Gly Ala Leu Pro Arg Pro 
20 25 30 

GCG CAA GCA CAG GAC GTC GTG GAC CTG GAC TTC TTC ACC CAG GAG CCG 206 
Ala Gln Ala Gln Asp Val Val Asp Leu Asp Phe Phe Thr Gln Glu Pro 
35 40 45 

CTG CAC CTG GTG AGC CCC TCG TTC CTG TCC GTC ACC ATT GAC GCC AAC 254 
Leu His Leu Val Ser Pro Ser Phe Leu Ser Val Thr Ile Asp Ala Asn 
50 55 60 

CTG GCC ACG GAC CCG CGG TTC CTC ATC CTC CTG GGT TCT CCA AAG CTT 302 
Leu Ala Thr Asp Pro Arg Phe Leu Ile Leu Leu Gly Ser Pro Lys Leu 
65 70 75 80 

CGT ACC TTG GCC AGA GGC TTG TCT CCT GCG TAC CTG AGG TTT GGT GGC 350 
Arg Thr Leu Ala Arg Gly Leu Ser Pro Ala Tyr Leu Arg Phe Gly Gly 
85 90 95 

ACC AAG ACA GAC TTC CTA ATT TTC GAT CCC AAG AAG GAA TCA ACC TTT 398 
Thr Lys Thr Asp Phe Leu Ile Phe Asp Pro Lys Lys Glu Ser Thr Phe 
100 105 110 

GAA GAG AGA AGT TAC TGG CAA TCT CAA GTC AAC CAG GAT ATT TGC AAA 446 
Glu Glu Arg Ser Tyr Trp Gln Ser Gln Val Asn Gln Asp Ile Cys Lys 
115 120 125 

TAT GGA TCC ATC CCT CCT GAT GTG GAG GAG AAG TTA CGG TTG GAA TGG 494 
Tyr Gly Ser Ile Pro Pro Asp Val Glu Glu Lys Leu Arg Leu Glu Trp 
130 135 140 

CCC TAC CAG GAG CAA TTG CTA CTC CGA GAA CAC TAC CAG AAA AAG TTC 542 
Pro Tyr Gln Glu Gln Leu Leu Leu Arg Glu His Tyr Gln Lys Lys Phe 
145 150 155 160 

AAG AAC AGC ACC TAC TCA AGA AGC TCT GTA GAT GTG CTA TAC ACT TTT 590 
Lys Asn Ser Thr Tyr Ser Arg Ser Ser Val Asp Val Leu Tyr Thr Phe 
165 170 175 

GCA AAC TGC TCA GGA CTG GAC TTG ATC TTT GGC CTA AAT GCG TTA TTA 638 
Ala Asn Cys Ser Gly Leu Asp Leu Ile Phe Gly Leu Asn Ala Leu Leu 
180 185 190 

AGA ACA GCA GAT TTG CAG TGG AAC AGT TCT AAT GCT CAG TTG CTC CTG 686 
Arg Thr Ala Asp Leu Gln Trp Asn Ser Ser Asn Ala Gln Leu Leu Leu 
195 200 205 

GAC TAC TGC TCT TCC AAG GGG TAT AAC ATT TCT TGG GAA CTA GGC AAT 734 
Asp Tyr Cys Ser Ser Lys Gly Tyr Asn Ile Ser Trp Glu Leu Gly Asn 
210 215 220 

GAA CCT AAC AGT TTC CTT AAG AAG GCT GAT ATT TTC ATC AAT GGG TCC 782 
Glu Pro Asn Ser Phe Leu Lys Lys Ala Asp Ile Phe Ile Asn Gly Ser 
225 230 235 240 

CAG TTA GGA GAA GAT TAT ATT CAA TTG CAT AAA CTT CTA AGA AAG TCC 830 
Gln Leu Gly Glu Asp Tyr Ile Gln Leu His Lys Leu Leu Arg Lys Ser 
245 250 255 

ACC TTC AAA AAT GCA AAA CTC TAT GGT CCT GAT GTT GGT CAG CCT CGA 878 
Thr Phe Lys Asn Ala Lys Leu Tyr Gly Pro Asp Val Gly Gln Pro Arg 
260 265 270 

AGA AAG ACG GCT AAG ATG CTG AAG AGC TTC CTG AAG GCT GGT GGA GAA 926 
Arg Lys Thr Ala Lys Met Leu Lys Ser Phe Leu Lys Ala Gly Gly Glu 
275 280 285 

GTG ATT GAT TCA GTT ACA TGG CAT CAC TAC TAT TTG AAT GGA CGG ACT 974 
Val Ile Asp Ser Val Thr Trp His His Tyr Tyr Leu Asn Gly Arg Thr 
290 295 300 

GCT ACC AGG GAA GAT TTT CTA AAC CCT GAT GTA TTG GAC ATT TTT ATT 1019 
Ala Thr Arg Glu Asp Phe Leu Asn Pro Asp Val Leu Asp Ile Phe Ile 
305 310 315 320 

TCA TCT GTG CAA AAA GTT TTC CAG GTG GTT GAG AGC ACC AGG CCT GGC 1067 
Ser Ser Val Gln Lys Val Phe Gln Val Val Glu Ser Thr Arg Pro Gly 
325 330 335 

AAG AAG GTC TGG TTA GGA GAA ACA AGC TCT GCA TAT GGA GGC GGA GCG 1115 
Lys Lys Val Trp Leu Gly Glu Thr Ser Ser Ala Tyr Gly Gly Gly Ala 
340 345 350 

CCC TTG CTA TCC GAC ACC TTT GCA GCT GGC TTT ATG TGG CTG GAT AAA 1163 
Pro Leu Leu Ser Asp Thr Phe Ala Ala Gly Phe Met Trp Leu Asp Lys 
355 360 365 

TTG GGC CTG TCA GCC CGA ATG GGA ATA GAA GTG GTG ATG AGG CAA GTA 1211 
Leu Gly Leu Ser Ala Arg Met Gly Ile Glu Val Val Met Arg Gln Val 
370 375 380 

TTC TTT GGA GCA GGA AAC TAC CAT TTA GTG GAT GAA AAC TTC GAT CCT 1259 
Phe Phe Gly Ala Gly Asn Tyr His Leu Val Asp Glu Asn Phe Asp Pro 
385 390 395 400 

TTA CCT GAT TAT TGG CTA TCT CTT CTG TTC AAG AAA TTG GTG GGC ACC 1307 
Leu Pro Asp Tyr Trp Leu Ser Leu Leu Phe Lys Lys Leu Val Gly Thr 
405 410 415 

AAG GTG TTA ATG GCA AGC GTG CAA GGT TCA AAG AGA AGG AAG CTT CGA 1355 
Lys Val Leu Met Ala Ser Val Gln Gly Ser Lys Arg Arg Lys Leu Arg 
420 425 430 

GTA TAC CTT CAT TGC ACA AAC ACT GAC AAT CCA AGG TAT AAA GAA GGA 1403 
Val Tyr Leu His Cys Thr Asn Thr Asp Asn Pro Arg Tyr Lys Glu Gly 
435 440 445 

GAT TTA ACT CTG TAT GCC ATA AAC CTC CAT AAC GTC ACC AAG TAC TTG 1451 
Asp Leu Thr Leu Tyr Ala Ile Asn Leu His Asn Val Thr Lys Tyr Leu 
450 455 460 

CGG TTA CCC TAT CCT TTT TCT AAC AAG CAA GTG GAT AAA TAC CTT CTA 1499 
Arg Leu Pro Tyr Pro Phe Ser Asn Lys Gln Val Asp Lys Tyr Leu Leu 
465 470 475 480 

AGA CCT TTG GGA CCT CAT GGA TTA CTT TCC AAA TCT GTC CAA CTC AAT 1547 
Arg Pro Leu Gly Pro His Gly Leu Leu Ser Lys Ser Val Gln Leu Asn 
485 490 495 

GGT CTA ACT CTA AAG ATG GTG GAT GAT CAA ACC TTG CCA CCT TTA ATG 1595 
Gly Leu Thr Leu Lys Met Val Asp Asp Gln Thr Leu Pro Pro Leu Met 
500 505 510 

GAA AAA CCT CTC CGG CCA GGA AGT TCA CTG GGC TTG CCA GCT TTC TCA 1643 
Glu Lys Pro Leu Arg Pro Gly Ser Ser Leu Gly Leu Pro Ala Phe Ser 
515 520 525 

TAT AGT TTT TTT GTG ATA AGA AAT GCC AAA GTT GCT GCT TGC ATC TGA 1691 
Tyr Ser Phe Phe Val Ile Arg Asn Ala Lys Val Ala Ala Cys Ile 
530 535 540 543 

AAA TAA AAT ATA CTA GTC CTG ACA CTG 1718 
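
The codon-by-codon correspondence between a nucleotide row and the amino acid row printed beneath it can be checked mechanically. The following is a minimal sketch, not part of the original listing; it assumes the standard genetic code, and the names `CODON_TABLE` and `translate` are hypothetical helpers introduced for illustration. The input is the start of the coding region shown above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) up to the first stop codon."""
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" marks an unreadable codon
        if aa == "*":  # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons of the open reading frame in the listing above.
print(translate("ATG CTG CTG CGC TCG AAG CCT GCG CTG CCG"))
```

The result is in single-letter code and can be compared with the three-letter row (Met Leu Leu Arg Ser Lys Pro Ala Leu Pro) under the same codons.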



(2) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 824 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 
CTGGCAAGAA GGTCTGGTTG GGAGAGACGA GCTCAGCTTA CGGTGGCGGT GCACCCTTGC 60 
TGTCCAACAC CTTTGCAGCT GGCTTTATGT GGCTGGATAA ATTGGGCCTG TCAGCCCAGA 120 
TGGGCATAGA AGTCGTGATG AGGCAGGTGT TCTTCGGAGC AGGCAACTAC CACTTAGTGG 180 
ATGAAAACTT TGAGCCTTTA CCTGATTACT GGCTCTCTCT TCTGTTCAAG AAACTGGTAG 240 
GTCCCAGGGT GTTACTGTCA AGAGTGAAAG GCCCAGACAG GAGCAAACTC CGAGTGTATC 300 
TCCACTGCAC TAACGTCTAT CACCCACGAT ATCAGGAAGG AGATCTAACT CTGTATGTCC 360 
TGAACCTCCA TAATGTCACC AAGCACTTGA AGGTACCGCC TCCGTTGTTC AGGAAACCAG 420 
TGGATACGTA CCTTCTCAAG CCTTCGGGGC CGGATGGATT ACTTTCCAAA TCTGTCCAAC 480 
TGAACGGTCA AATTCTGAAG ATGGTGGATG AGCAGACCCT GCCAGCTTTG ACAGAAAAAC 540 
CTCTCCCCGC AGGAAGTGCA CTAAGCCTGC CTCCCTTTTC CTATGGTTTT TTTGTCATAA 600 
GAAATGCCAA AATCGCTGCT TGTATATGAA AATAAAAGGC ATACGGTACC CCTGAGACAA 660 
AAGCCGAGGG GGGTGTTATT CATAAAACAA AACCCTAGTT TAGGAGGCCA CCTCCTTGCC 720 
GAGTTCCAGA GCTTCGGGAG GGTGGGGTAC ACTTCAGTAT TACATTCAGT GTGGTGTTCT 780 
CTCTAAGAAG AATACTGCAG GTGGTGACAG TTAATAGCAC TGTG 824 



(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1899 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 
GGGAAAGCGA GCAAGGAAGT AGGAGAGAGC CGGGCAGGCG GGGCGGGGTT GGATTGGGAG 60 
CAGTGGGAGG GATGCAGAAG AGGAGTGGGA GGGATGGAGG GCGCAGTGGG AGGGGTGAGG 120 
AGGCGTAACG GGGCGGAGGA AAGGAGAAAA GGGCGCTGGG GCTCGGCGGG AGGAAGTGCT 180 
AGAGCTCTCG ACTCTCCGCT GCGCGGCAGC TGGCGGGGGG AGCAGCCAGG TGAGCCCAAG 240 
ATGCTGCTGC GCTCGAAGCC TGCGCTGCCG CCGCCGCTGA TGCTGCTGCT CCTGGGGCCG 300 
CTGGGTCCCC TCTCCCCTGG CGCCCTGCCC CCACCTGCCC AACCACAGGA CGTCGTGGAC 360 
CTGGACTTCT TCACCCAGGA GCCGCTGCAC CTGGTGAGCC CCTCGTTCCT GTCCGTCACC 420 
ATTGACGCCA ACCTGGCCAC GGACCCCCGG TTCCTCATCC TCCTGGGTTC TCCAAAGCTT 480 
CCTACCTTGG CCAGAGGCTT GTCTCCTGCG TACCTGAGGT TTGGTGGCAC CAAGACAGAC 540 
TTCCTAATTT TCGATCCCAA GAAGGAATCA ACCTTTCAAG AGAGAAGTTA CTGGCAATCT 600 
CAAGTCAACC ACGATATTTC CAAATATGCA TCCATCCCTC CTGATGTGGA GGAGAAGTTA 660 
CCCTTGGAAT CGCCCTACCA CGAGCAATTC CTACTCCGAC AACACTACCA GAAAAAGTTC 720 



SUBSTITUTE SHEET (RULE 26) 
AAGAACAGCA CCTACTCAAG AAGCTCTGTA GATGTGCTAT ACACTTTTCC AAACTGCTCA 780 
GGACTGGACT TGATCTTTGG CCTAAATGCG TTATTAAGAA CAGCAGATTT GCAGTGGAAC 840 
AGTTCTAATG CTCAGTTGCT CCTGGACTAC TGCTCTTCCA AGGGGTATAA CATTTCTTGG 900 
GAACTAGGCA ATGAACCTAA CAGTTTCCTT AAGAAGGCTG ATATTTTCAT CAATGGGTCG 960 
CAGTTAGGAG AAGATTATAT TCAATTGCAT AAACTTCTAA GAAAGTCCAC CTTCAAAAAT 1020 
GCAAAACTCT ATGGTCCTGA TGTTGGTCAG CCTCGAAGAA AGACGGCTAA GATGCTGAAG 1080 
AGCTTCCTGA AGGCTGGTGG AGAAGTGATT GATTCAGTTA CATGGCATCA CTACTATTTG 1140 
AATGGACGGA CTGCTACCAG GGAAGATTTT CTAAACCCTG ATGTATTGGA CATTTTTATT 1200 
TCATCTGTGC AAAAAGTTTT CCAGGTGGTT GAGAGCACCA GGCCTGGCAA GAAGGTCTGG 1260 
TTAGGAGAAA CAAGCTCTGC ATATGGAGGC GGAGCGCCCT TGCTATCCGA CACCTTTGCA 1320 
GCTGGCTTTA TGTGGCTGGA TAAATTGGGC CTGTCAGCCC GAATGGGAAT AGAAGTGGTG 1380 
ATGAGGCAAG TATTCTTTGG AGCAGGAAAC TACCATTTAG TGGATGAAAA CTTCGATCCT 1440 
TTACCTGATT ATTGGCTATC TCTTCTGTTC AAGAAATTGG TGGGCACCAA GGTGTTAATG 1500 
GCAAGCGTGC AAGGTTCAAA GAGAAGGAAG CTTCGAGTAT ACCTTCATTG CACAAACACT 1560 
GACAATCCAA GGTATAAAGA AGGAGATTTA ACTCTGTATG CCATAAACCT CCATAACGTC 1620 
ACCAAGTACT TGCGGTTACC CTATCCTTTT TCTAACAAGC AAGTGGATAA ATACCTTCTA 1680 
AGACCTTTGG GACCTCATGG ATTACTTTCC AAATCTGTCC AACTCAATGG TCTAACTCTA 1740 
AAGATGGTGG ATGATCAAAC CTTGCCACCT TTAATGGAAA AACCTCTCCG GCCAGGAAGT 1800 
TCACTGGGCT TGCCAGCTTT CTCATATAGT TTTTTTGTGA TAAGAAATGC CAAAGTTGCT 1860 
GCTTGCATCT GAAAATAAAA TATACTAGTC CTGACACTG 1899 



(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 592 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

Met Glu Gly Ala Val Gly Gly Val Arg Arg Arg Asn Gly Ala Glu 
5 10 15 

Glu Arg Arg Lys Gly Arg Trp Gly Ser Ala Gly Gly Ser Ala Arg 
20 25 30 

Ala Leu Asp Ser Pro Leu Arg Gly Ser Trp Arg Gly Glu Gln Pro 
35 40 45 

Gly Glu Pro Lys Met Leu Leu Arg Ser Lys Pro Ala Leu Pro Pro 
50 55 60 

Pro Leu Met Leu Leu Leu Leu Gly Pro Leu Gly Pro Leu Ser Pro 
65 70 75 

Gly Ala Leu Pro Arg Pro Ala Gln Ala Gln Asp Val Val Asp Leu 
80 85 90 

Asp Phe Phe Thr Gln Glu Pro Leu His Leu Val Ser Pro Ser Phe 
95 100 105 

Leu Ser Val Thr Ile Asp Ala Asn Leu Ala Thr Asp Pro Arg Phe 
110 115 120 

Leu Ile Leu Leu Gly Ser Pro Lys Leu Arg Thr Leu Ala Arg Gly 
125 130 135 

Leu Ser Pro Ala Tyr Leu Arg Phe Gly Gly Thr Lys Thr Asp Phe 
140 145 150 

Leu Ile Phe Asp Pro Lys Lys Glu Ser Thr Phe Glu Glu Arg Ser 
155 160 165 

Tyr Trp Gln Ser Gln Val Asn Gln Asp Ile Cys Lys Tyr Gly Ser 
170 175 180 

Ile Pro Pro Asp Val Glu Glu Lys Leu Arg Leu Glu Trp Pro Tyr 
185 190 195 

Gln Glu Gln Leu Leu Leu Arg Glu His Tyr Gln Lys Lys Phe Lys 
200 205 210 

Asn Ser Thr Tyr Ser Arg Ser Ser Val Asp Val Leu Tyr Thr Phe 
215 220 225 

Ala Asn Cys Ser Gly Leu Asp Leu Ile Phe Gly Leu Asn Ala Leu 
230 235 240 

Leu Arg Thr Ala Asp Leu Gln Trp Asn Ser Ser Asn Ala Gln Leu 
245 250 255 

Leu Leu Asp Tyr Cys Ser Ser Lys Gly Tyr Asn Ile Ser Trp Glu 
260 265 270 

Leu Gly Asn Glu Pro Asn Ser Phe Leu Lys Lys Ala Asp Ile Phe 
275 280 285 

Ile Asn Gly Ser Gln Leu Gly Glu Asp Tyr Ile Gln Leu His Lys 
290 295 300 

Leu Leu Arg Lys Ser Thr Phe Lys Asn Ala Lys Leu Tyr Gly Pro 
305 310 315 

Asp Val Gly Gln Pro Arg Arg Lys Thr Ala Lys Met Leu Lys Ser 
320 325 330 

Phe Leu Lys Ala Gly Gly Glu Val Ile Asp Ser Val Thr Trp His 
335 340 345 

His Tyr Tyr Leu Asn Gly Arg Thr Ala Thr Arg Glu Asp Phe Leu 
350 355 360 

Asn Pro Asp Val Leu Asp Ile Phe Ile Ser Ser Val Gln Lys Val 
365 370 375 

Phe Gln Val Val Glu Ser Thr Arg Pro Gly Lys Lys Val Trp Leu 
380 385 390 

Gly Glu Thr Ser Ser Ala Tyr Gly Gly Gly Ala Pro Leu Leu Ser 
395 400 405 

Asp Thr Phe Ala Ala Gly Phe Met Trp Leu Asp Lys Leu Gly Leu 
410 415 420 

Ser Ala Arg Met Gly Ile Glu Val Val Met Arg Gln Val Phe Phe 
425 430 435 

Gly Ala Gly Asn Tyr His Leu Val Asp Glu Asn Phe Asp Pro Leu 
440 445 450 

Pro Asp Tyr Trp Leu Ser Leu Leu Phe Lys Lys Leu Val Gly Thr 
455 460 465 

Lys Val Leu Met Ala Ser Val Gln Gly Ser Lys Arg Arg Lys Leu 
470 475 480 

Arg Val Tyr Leu His Cys Thr Asn Thr Asp Asn Pro Arg Tyr Lys 
485 490 495 

Glu Gly Asp Leu Thr Leu Tyr Ala Ile Asn Leu His Asn Val Thr 
500 505 510 

Lys Tyr Leu Arg Leu Pro Tyr Pro Phe Ser Asn Lys Gln Val Asp 
515 520 525 

Lys Tyr Leu Leu Arg Pro Leu Gly Pro His Gly Leu Leu Ser Lys 
530 535 540 

Ser Val Gln Leu Asn Gly Leu Thr Leu Lys Met Val Asp Asp Gln 
545 550 555 

Thr Leu Pro Pro Leu Met Glu Lys Pro Leu Arg Pro Gly Ser Ser 
560 565 570 

Leu Gly Leu Pro Ala Phe Ser Tyr Ser Phe Phe Val Ile Arg Asn 
575 580 585 

Ala Lys Val Ala Ala Cys Ile 
590 592 













(2) INFORMATION FOR SEQ ID NO:15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1899 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

GGG 3 

AAA GCG AGC AAG GAA GTA GGA GAG AGC CGG GCA GGC GGG GCG CGG 48 

TTG GAT TGG GAG CAG TGG GAG GGA TGC AGA AGA GGA GTG GGA GGG 93 

ATG GAG GGC GCA GTG GGA GGG GTG AGG AGG CGT AAC GGG GCG GAG 138 
Met Glu Gly Ala Val Gly Gly Val Arg Arg Arg Asn Gly Ala Glu 
5 10 15 

GAA AGG AGA AAA GGG CGC TGG GGC TCG GCG GGA GGA AGT GCT AGA 183 
Glu Arg Arg Lys Gly Arg Trp Gly Ser Ala Gly Gly Ser Ala Arg 
20 25 30 

GCT CTC GAC TCT CCG CTG CGC GGC AGC TGG CGG GGG GAG CAG CCA 228 
Ala Leu Asp Ser Pro Leu Arg Gly Ser Trp Arg Gly Glu Gln Pro 
35 40 45 

GGT GAG CCC AAG ATG CTG CTG CGC TCG AAG CCT GCG CTG CCG CCG 273 
Gly Glu Pro Lys Met Leu Leu Arg Ser Lys Pro Ala Leu Pro Pro 
50 55 60 

CCG CTG ATG CTG CTG CTC CTG GGG CCG CTG GGT CCC CTC TCC CCT 318 
Pro Leu Met Leu Leu Leu Leu Gly Pro Leu Gly Pro Leu Ser Pro 
65 70 75 

GGC GCC CTG CCC CGA CCT GCG CAA GCA CAG GAC GTC GTG GAC CTG 363 
Gly Ala Leu Pro Arg Pro Ala Gln Ala Gln Asp Val Val Asp Leu 
80 85 90 

GAC TTC TTC ACC CAG GAG CCG CTG CAC CTG GTG AGC CCC TCG TTC 408 
Asp Phe Phe Thr Gln Glu Pro Leu His Leu Val Ser Pro Ser Phe 
95 100 105 

CTG TCC GTC ACC ATT GAC GCC AAC CTG GCC ACC GAC CCC CGG TTC 453 
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IX 

Leu Ser Val Thr lie Asp Ata Asn Leu Ala Thr Asp Pro Arg Phe 

no 115 120 



CTC ATC CTC CTG GGT TCT CCA AAG CTT CGT ACC 7TG GCC AGA GGC 
Leu lie Leu Leu Gly Ser Pro Lys Leu Arg Thr Leu Ala Arg Gly 

125 130 135 



493 



TTG TCT CCT GCG TAC CTG AGG TTT GGT GGC ACC AAG ACA GAC TTC 
Leu Ser Pro Ala Tyr Leu Arg Phe Gly Gly Thr Lys Thr Asp Phe 

UO 1A5 150 



543 



CTA ATT TTC GAT CCC AAG AAG GAA TCA ACC TTT GAA GAG AGA AGT 
Leu He Phe Asp Pro Lys Lys Glu Ser Thr Phe Glu Glu Arg Ser 

155 160 165 



588 



TAC TGG CAA TCT CAA GTC AAC CAG GAT ATT TGC AAA TAT GGA TCC 
Tyr Trp Gin Ser Gin Val Asn Gin Asp lie Cys Lys Tyr Gly Ser 

170 175 180 



633 



ATC CCT CCT GAT GTG GAG GAG AAG TTA CGG TTG GAA TGG CCC TAC 
lie Pro Pro Asp Val Glu Glu Lys Leu Arg Leu Glu Trp Pro Tyr 

185 190 195 



678 



CAG GAG CAA TTG CTA CTC CGA GAA CAC TAC CAG AAA AAG TTC AAG 
Gin Glu Gin Leu Leu Leu Arg Glu His Tyr Gin Lys Lys Phe Lys 

200 205 210 



723 



AAC AGC ACC TAC TCA AGA AGC TCT GTA' GAT GTG CTA TAC ACT TTT 768 

Asn Ser Thr Tyr Ser Arg Ser Ser Val Asp Vat Leu Tyr Thr Phe 

215 220 225 

GCA AAC TGC TCA GGA CTG GAC TTG ATC TTT GGC CTA AAT GCG TTA 813 

Ala Asn Cys Ser Gly Leu Asp Leu lie Phe Gly Leu Asn Ala Leu 

230 235 240 

TTA AGA ACA GCA CAT TTC CAG TGG AAC AGT TCT AAT GCT CAG TTG 858 

Leu Arg Thr Ala Asp Leu Gin Trp Asn Ser Ser Asn Ala Gin Leu 

245 250 255 

CTC CTG GAC TAC TGC TCT TCC AAG GGG TAT AAC ATT TCT TGG GAA 903 

Leu Leu Asp Tyr Cys Ser Ser Lys Gly Tyr Asn He Ser Trp Glu 

260 265 270 



CTA GGC AAT GAA CCT AAC AGT TTC CTT AAG AAG GCT GAT ATT TTC 
Leu Gly Asn Glu Pro Asn Ser Phe Leu Lys Lys Ala Asp He Phe 

275 280 285 



948 



ATC AAT GGG TCG CAG TTA GGA GAA GAT TAT ATT CAA TTG CAT AAA 
He Asn Gly Ser Gin Leu Gly Glu Asp Tyr He Gin Leu His Lys 

290 295 300 



993 



CTT CTA AGA AAG TCC ACC TTC AAA AAT GCA AAA CTC TAT GGT CCT 
Leu Leu Arg Lys Ser Thr Phe Lys Asn Ala Lys Leu Tyr Gly Pro 

305 310 315 



1038 



GAT GTT GGT CAG CCT CGA AGA AAG ACG GCT AAG ATG CTG AAG AGC 1083 
Asp Val Gly Gin Pro Arg Arg Lys Thr Ala Lys Met Leu Lys Ser 

320 325 330 

TTC CTG AAG GCT GGT GGA GAA GTG ATT GAT TCA GTT ACA TGG CAT 1128 
Phe Leu Lys Ala Gly Gly Glu Val He Asp Ser Val Thr Trp His 

335 340 345 

CAC TAC TAT TTG AAT GGA CGG ACT GCT ACC AGG GAA GAT TTT CTA 1173 
His Tyr Tyr Leu Asn Gly Arg Thr Ala Thr Arg Glu Asp Phe Leu 

350 355 360 

AAC CCT GAT GTA TTG GAC ATT TTT ATT TCA TCT GTG CAA AAA GTT 1218 
Asn Pro Asp Val Leu Asp He Phe He Ser Ser Vol Gin Lys Val 

365 370 375 

TTC CAG CTG GTT GAG ACC ACC AGG CCT GGC AAG AAG GTC TGG TTA 1263 
Phe Gin Val Val Glu Ser Thr Arg Pro Gly Lys Lys Val Trp Leu 

380 385 390 
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GGA CAA ACA AGC TCT GCA TAT GGA GGC GGA GCG CCC TTG CTA TCC 
Gly Glu Thr Ser Ser Ala Tyr Gly Gly Gly Ala Pro Leu Leu Ser 

395 400 405 



1308 



GAC ACC TTT GCA GCT GGC TTT ATG TGG CTG GAT AAA TTG GGC CTG 
Asp Thr Phe Ata Ala Gly Phe Met Trp Leu Asp Lys Leu Gly Leu 

410 415 420 



1353 



TCA GCC CGA ATG GGA ATA gAA GTG GTG ATG AGG CAA GTA TTC TTT 
Ser Ala Arg Met Gly lie Glu Val Val Met Arg Gin Val Phe Phe 

4Z5 430 435 



1398 



GGA GCA GGA AAC TAC CAT TTA GTG GAT GAA AAC TTC GAT CCT TTA 
Gly Ata Gly Asn Tyr His Leu Val Asp Glu Asn Phe Asp Pro Leu 

440 445 450 



1443 



CCT GAT TAT TGG CTA TCT CTT CTG TTC AAG AAA TTG GTG GGC ACC 1488
Pro Asp Tyr Trp Leu Ser Leu Leu Phe Lys Lys Leu Val Gly Thr

455 460 465



AAG CTG TTA ATG GCA AGC GTG CAA GGT TCA AAG AGA AGG AAG CTT 1533
Lys Val Leu Met Ala Ser Val Gln Gly Ser Lys Arg Arg Lys Leu

470 475 480



CGA GTA TAC CTT CAT TGC ACA AAC ACT GAC AAT CCA AGG TAT AAA 1578
Arg Val Tyr Leu His Cys Thr Asn Thr Asp Asn Pro Arg Tyr Lys

485 490 495



GAA GGA GAT TTA ACT CTG TAT GCC ATA AAC CTC CAT AAC GTC ACC 1623
Glu Gly Asp Leu Thr Leu Tyr Ala Ile Asn Leu His Asn Val Thr

500 505 510



AAG TAC TTG CGG TTA CCC TAT CCT TTT TCT AAC AAG CAA GTG GAT 1668
Lys Tyr Leu Arg Leu Pro Tyr Pro Phe Ser Asn Lys Gln Val Asp

515 520 525



AAA TAC CTT CTA AGA CCT TTG GGA CCT CAT GGA TTA CTT TCC AAA 1713
Lys Tyr Leu Leu Arg Pro Leu Gly Pro His Gly Leu Leu Ser Lys

530 535 540



TCT GTC CAA CTC AAT GGT CTA ACT CTA AAG ATG GTG GAT GAT CAA 1758
Ser Val Gln Leu Asn Gly Leu Thr Leu Lys Met Val Asp Asp Gln

545 550 555



ACC TTG CCA CCT TTA ATG GAA AAA CCT CTC CGG CCA GGA AGT TCA 1803
Thr Leu Pro Pro Leu Met Glu Lys Pro Leu Arg Pro Gly Ser Ser

560 565 570



CTG GGC TTG CCA GCT TTC TCA TAT AGT TTT TTT GTG ATA AGA AAT 1848
Leu Gly Leu Pro Ala Phe Ser Tyr Ser Phe Phe Val Ile Arg Asn

575 580 585



GCC AAA GTT GCT GCT TGC ATC TGA AAA TAA AAT ATA CTA GTC CTG 1893
Ala Lys Val Ala Ala Cys Ile

590 592 

ACA CTG 1899 



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 594
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

ATIACTATAG GGCACGCGTG GTCGACGGCC CGGGCTGGTA TTGTCTTAAT GAGAAGTTGA 60
TAAAGAATTT TGGGTGGTTG ATCTCTTTCC AGCTGCAGTT TAGCGTATGC TGAGGCCAGA 120
TTTTTTCAGG CAAAAGTAAA ATACCTGAGA AACTGCCTGG CCAGAGGACA ATCAGATTTT 180
GGCTGGCTCA AGTGACAAGC AAGTGTTTAT AAGCTAGATG GGAGAGGAAG GGATGAATAC 240
TCCATTGGAG GCTTTACTCG AGGGTCAGAG GGATACCCGG CGCCATCAGA ATGGGATCTG 300
CGAGTCCGAA ACCCTGGGTT CCCACGAGAG CGCGCAGAAC ACGTGCGTCA GGAAGCCTGG 360
TCCGGGATCC CCAGCGCTGC TCCCCGGGCG CTCCTCCCCG GGCGCTCCTC CCCAGGCCTC 420
CCGGGCGCTT GGATCCCGGC CATCTCCGCA CCCTTCAAGT GGGTGTGGGT GATTTCGTAA 480
G1GAACGTGA CCGCCACCGG GGGGAAAGCC AGCAAGGAAG TAGGAGAGAG CCGGGCAGGC 540
GGGGCGGGGT TGGATTGGGA CCAGTGGGAC GGATGCAGAA GAGGAGTGGG AGOG 594
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CCCCAGGAGC AGCAGCATCA G 21



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

AGGCTTCGAG CGCAGCAGCA T 21



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GTAATACGAC TCACTATAGG GC 22



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ACTATAGGGC ACGCGTGGT 19



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CTTGGGCTCA CCTGGCTGCT C 21



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

AGCTCTGTAG ATGTGCTATA CAC 23



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GCATCTTAGC CGTCTTTCTT CG 22